- Influence/Virus/Label Propagation
- Big Data Graph Databases
- Big Data Graph Processing
- Social Network Analysis
- Learn From: Social Network Analysis
- Some earlier topics also relate to this topic. https://hpi.de/fileadmin/user_upload/fachgebiete/mueller/courses/graphmining/GraphMining-02-Social-Network-Analysis.pdf
- Querying Graphs: Isomorphic Graphs
- Learn From: Querying Graphs: Isomorphic Graphs
- https://hpi.de/fileadmin/user_upload/fachgebiete/mueller/courses/graphmining/GraphMIning-03-Querying-Graphs.pdf
- Mining graph patterns
- Learn From: Mining graph patterns
- https://hpi.de/fileadmin/user_upload/fachgebiete/mueller/courses/graphmining/GraphMining-04-FrequentSubgraph.pdf
- Graph Mining and Classifications
- Tools and Examples
- List 1: Top 30 Social Network Analysis and Visualization Tools
- Commetrix, Cuttlefish, Cytoscape, EgoNet, Gephi, Graph-tool, GraphChi, Graphviz
- List 2: Graph Analytics Tools
- igraph, NetworkX, graph-tool, Gephi, SNAP, JUNG, Mathematica, D3.js, Cytoscape
- Probable Assignments
- Graph Datasets for Assignments and Projects; also for research
- You will see many datasets at the URL. You can apply your implementation to a dataset/graph of your choice, but make sure its properties (directed/undirected, weighted/unweighted, connected/not connected, and others) match the question.
- Python or R Libraries for Graph Mining Implementations
- https://networkx.github.io/documentation/latest/_downloads/networkx_reference.pdf
- https://networkx.github.io/documentation/stable/reference/introduction.html
- List 1: Implement the following algorithms and apply them to large graphs (in plain Python or R, without graph libraries)
- Click the above link for details: BFS, DFS, single-source shortest path, all-pairs shortest path, Karger's algorithm, min-cut algorithm, clique algorithms, Dijkstra's shortest path implementation (for shortest paths or the longest paths)
- More will be added later. Any algorithm that you come across can be an assignment as well. Try to study and implement the most important algorithms: those that are used in practice, are well known, or solve important problems.
- List 2: Use graph algorithms provided by the NetworkX (Python) library, or an equivalent graph library in R
- Assignments that demonstrate the ability to use the graph algorithms provided in the NetworkX library. For example, find the library methods that correspond to the algorithms in List 1, apply them to the same datasets, and check whether you get the same results.
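As an illustration of the List 1 / List 2 comparison, the following is a minimal sketch, assuming a small undirected, unweighted graph given as an edge list. The function names (`bfs_distances`, `build_adjacency`, `matches_networkx`) and the toy edge list are hypothetical, chosen only for this example. The comparison helper uses NetworkX's `single_source_shortest_path_length` and needs NetworkX installed; the rest is plain Python.

```python
from collections import deque


def build_adjacency(edges):
    """Build undirected adjacency sets from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Return hop counts from source to every reachable node (plain BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:          # first visit gives the shortest hop count
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def matches_networkx(edges, source):
    """Return True if the plain-Python BFS agrees with NetworkX."""
    import networkx as nx  # imported lazily; only needed for the comparison
    graph = nx.Graph()
    graph.add_edges_from(edges)
    expected = dict(nx.single_source_shortest_path_length(graph, source))
    return bfs_distances(build_adjacency(edges), source) == expected


# Hypothetical toy graph; replace with an edge list loaded from a real dataset.
edges = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "e"), ("e", "c")]
distances = bfs_distances(build_adjacency(edges), "a")
print(distances)
```

Calling `matches_networkx(edges, "a")` on the same edge list is one way to check whether both approaches give the same result.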
- List 3: Spanning Tree, Connected Subgraphs
- In short: implement spanning tree and highly connected subgraphs
- List 4: Based on Graph Properties and Concepts
- Given a graph (use a dataset or a small graph first, then apply to large graphs), write Python or R code to: identify the isolated nodes in the graph, count the bi-directional edges, identify the top 10 vertices by in-degree, identify the top 100 vertices by out-degree, count the number of cliques, and identify the number of disconnected subgraphs.
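Several of the List 4 tasks can be sketched in plain Python as below. This is only an illustrative sketch, assuming a directed graph given as a node list plus an edge list; the function name `graph_properties`, the result keys, and the toy data are hypothetical. "Disconnected subgraphs" is interpreted here as weakly connected components, which is an assumption, and clique counting is left out.

```python
from collections import Counter


def graph_properties(nodes, edges, k=10):
    """Compute a few basic properties of a directed graph."""
    edge_set = set(edges)                      # ignore duplicate edges
    all_nodes = set(nodes) | {n for e in edge_set for n in e}

    in_deg = Counter(v for _, v in edge_set)
    out_deg = Counter(u for u, _ in edge_set)

    # Isolated nodes have no incoming and no outgoing edges.
    isolated = sorted(n for n in all_nodes if in_deg[n] == 0 and out_deg[n] == 0)

    # Each reciprocated pair (u, v), (v, u) is seen twice, so halve the count.
    bidirectional = sum(
        1 for u, v in edge_set if u != v and (v, u) in edge_set
    ) // 2

    # Weakly connected components: ignore edge direction, then run DFS.
    undirected = {n: set() for n in all_nodes}
    for u, v in edge_set:
        undirected[u].add(v)
        undirected[v].add(u)
    seen, components = set(), 0
    for start in all_nodes:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for w in undirected[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)

    return {
        "isolated": isolated,
        "bidirectional_pairs": bidirectional,
        "top_in_degree": in_deg.most_common(k),
        "top_out_degree": out_deg.most_common(k),
        "weak_components": components,
    }


# Hypothetical toy graph: a small cycle, a reciprocated pair, one isolated node.
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 1), (2, 3), (3, 1), (4, 5), (5, 4)]
props = graph_properties(nodes, edges, k=2)
print(props)
```

For the assignment, `k` would be 10 for in-degrees and 100 for out-degrees; ties in the rankings are returned in arbitrary order.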
- List 5: Centrality-PageRank-Betweenness
- For a graph or graph dataset such as political blogs (see the example dataset section), implement vertex betweenness, edge betweenness, and closeness centrality.
- Implement page-ranking algorithms such as HITS (hubs and authorities).
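For the HITS part of List 5, a minimal power-iteration sketch is shown below. It assumes a directed graph given as an edge list, a fixed number of iterations instead of a convergence test, and normalisation of the scores so that they sum to 1; the function name `hits` and the toy links are hypothetical. Betweenness and closeness are not covered here.

```python
def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a directed edge list."""
    nodes = sorted({n for e in edges for n in e})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of nodes that link to v.
        new_auth = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_auth[v] += hub[u]

        # Hub update: sum the (new) authority scores of nodes that u links to.
        new_hub = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_hub[u] += new_auth[v]

        # Normalise so each score vector sums to 1 (guard against all zeros).
        a_total = sum(new_auth.values()) or 1.0
        h_total = sum(new_hub.values()) or 1.0
        auth = {n: s / a_total for n, s in new_auth.items()}
        hub = {n: s / h_total for n, s in new_hub.items()}

    return hub, auth


# Hypothetical toy graph: "a" and "b" both link to "c"; "a" also links to "d".
links = [("a", "c"), ("b", "c"), ("a", "d")]
hub_scores, auth_scores = hits(links)
print(max(auth_scores, key=auth_scores.get), max(hub_scores, key=hub_scores.get))
```

On a real dataset, the results can be compared against a library implementation to check the ranking order.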
- Potential Project Ideas
- List 1: Project Ideas with real-life applications
- From: https://cs.stanford.edu/people/jure/talks/www08tutorial/. You need to find datasets to implement these ideas. Check the above URLs for datasets (I have not checked them yet).
- Part 4: Case studies
- Communication patterns of MSN Messenger. The application of the above-mentioned tools and algorithms to a large network of communication on MSN Instant Messenger (30 billion conversations, 240 million people).
- Detecting fraud on eBay. How to find fraudulent people on eBay. We present a belief propagation method that is able to find fraudulent people in large networks.
- Monitoring social and communication networks over time -- intrusion and outlier detection. An application of tensor decomposition techniques to monitor multiple time series over time and detect outliers and abnormal events
- Web projections. Exploiting the structure of web graph to predict the quality of search results, user intention to reformulate queries and to find spam search results.
- Connection subgraphs and CenterPiece subgraphs.
- List 2: Check the Data Science Competitions to get Project Ideas
- Click above and check the data science competitions to get project ideas. For example, can you map the current dengue spread into a graph and predict the path along which the disease will spread?
- List 3: Take a Google function area in graph mining, apply it to very large graphs, and see what insight you can get
- You can also do research in any of these areas to improve the algorithms and their performance, as well as apply them to new problems/challenges. Google job areas:
- Large-Scale Balanced Partitioning (example: Google Maps driving directions)
- Large-Scale Clustering: clustering graphs at Google scale
- Large-Scale Connected Components
- Large-Scale Link Modeling: similarity ranking and centrality metrics, link prediction and anomalous link discovery
- Large-Scale Similarity Ranking: Personalized PageRank, Egonet similarity, Adamic-Adar, and others
- Public-private Graph Computation
- Streaming and Dynamic Graph Algorithms
- ASYMP: Async Message Passing Graph Mining
- Large-Scale Centrality Ranking
- Large-Scale Graph Building
- Potential Research Topics
- List 1: Research on Google Job/Function areas in Graph Mining
- Take one of these Google job/function areas in graph mining and improve the algorithms for performance, as well as apply them to new problems/challenges. Google job areas:
- Large-Scale Balanced Partitioning (example: Google Maps driving directions)
- Large-Scale Clustering: clustering graphs at Google scale
- Large-Scale Connected Components
- Large-Scale Link Modeling: similarity ranking and centrality metrics, link prediction and anomalous link discovery
- Large-Scale Similarity Ranking: Personalized PageRank, Egonet similarity, Adamic-Adar, and others
- Public-private Graph Computation
- Streaming and Dynamic Graph Algorithms
- ASYMP: Async Message Passing Graph Mining
- Large-Scale Centrality Ranking
- Large-Scale Graph Building
- Example Projects in Graph Mining
- My Implementation: Louvain Algorithm for Community Detection in Large Graph Networks
- https://github.com/sayedum/spark-implementation-louvian-modularity.git
- This is a private repository. I might give access to selected participants; you have to request it, and I need to know what you will do with it. It uses PySpark and Spark GraphFrames on Hadoop platforms. There is also a non-Spark, non-parallel Python implementation. With some extensions, and by trying to answer the right question, this has the potential to become a research publication.
- GraphX Implementation of Louvain Modularity Algorithm for Community Detection
- https://github.com/Sotera/spark-distributed-louvain-modularity. Not my implementation. GraphX is older than GraphFrames.
- PageRank Algorithm Implementation
- https://www.geeksforgeeks.org/page-rank-algorithm-implementation/. Not my implementation. I might share code blocks from my implementation.
- Centrality and Betweenness or similar concept implementation
- https://www.geeksforgeeks.org/betweenness-centrality-centrality-measure/
- However, it is best to first try to implement these on your own. Do not just memorize; try to build the ability to convert a textual concept or algorithm (mathematics) into code.
- Dijkstra’s shortest path algorithm
- https://www.geeksforgeeks.org/dijkstras-shortest-path-algorithm-greedy-algo-7/
- Floyd Warshall Algorithm | All-Pairs Shortest Path
- https://www.geeksforgeeks.org/floyd-warshall-algorithm-dp-16/
- Karger's mincut algorithm in Python
- Details: http://goatleaps.xyz/programming/kargers-algorithm.html Code File: http://goatleaps.xyz/assets/code/kargers_mincut.py
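Before reading the linked code, it may help to see the idea of Karger's randomized contraction in a compact form. The following is a sketch only and is not the code from the link above. It assumes a connected, undirected graph given as an edge list, performs the contractions by scanning the edges in random order with a union-find structure, and repeats for a fixed number of trials. The function name `karger_min_cut`, the `trials` and `seed` parameters, and the toy graph are hypothetical.

```python
import random


def karger_min_cut(edges, trials=200, seed=0):
    """Estimate the minimum cut size of a connected undirected graph."""
    rng = random.Random(seed)          # seeded so that runs are repeatable
    nodes = {n for e in edges for n in e}
    best = None

    for _ in range(trials):
        parent = {n: n for n in nodes}  # union-find: each node is its own group

        def find(x):
            # Follow parent pointers to the group representative (path halving).
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        groups = len(nodes)
        order = list(edges)
        rng.shuffle(order)             # random edge order = random contractions
        for u, v in order:
            if groups == 2:
                break
            ru, rv = find(u), find(v)
            if ru != rv:               # skip edges inside an already merged group
                parent[ru] = rv
                groups -= 1

        # Edges whose endpoints ended up in different groups form the cut.
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = cut if best is None else min(best, cut)

    return best


# Hypothetical toy graph: two triangles joined by the single edge (3, 4).
demo_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
min_cut = karger_min_cut(demo_edges)
print(min_cut)
```

A single trial can miss the true minimum cut, so the number of trials trades running time against the chance of a wrong answer.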
8112223 Canada Inc./JustEtc: http://JustEtc.net
Linkedin: https://ca.linkedin.com/in/sayedjustetc
Courses: http://Training.SitesTree.com (Big Data, Cloud, Security, Machine Learning)
Blog: http://Bangla.SaLearningSchool.com, http://SitesTree.com