Added functionality of parallel maximal independent set #145
Open: pnijhara wants to merge 19 commits into networkx:main from pnijhara:parallel-mis
+310
−9
e3d65be pnijhara: Added functionality of parallel mis
562c725 pnijhara: Fix get chunk tests
393ff44 pnijhara: Implemented custom sequential mis instead of nx.mis for node count le…
6c94be7 pnijhara: Remove unnecessary docstrings
375f0b1 pnijhara: Remove unnecessary docstrings
dacab14 pnijhara: Remove assert messages and docstrings from mis tests
c592668 pnijhara: Add NetworkX fallback with custom input node count
0a73d7b pnijhara: Replace nx mis unwrap method with direct call
e76f9ef pnijhara: Fix suggest review changes in mis.py
31770ff pnijhara: Fix suggested review changes in utils and policies
9179f1b pnijhara: Remove unused typing.Any import from mis.py
6810dee pnijhara: Refactor MIS tests to focus on nx-parallel specific aspects
960b0b3 pnijhara: Simplify NetworkX MIS import using nx.maximal_independent_set.orig_func
f0a9aec pnijhara: Remove redundant should_run check and unused _nx_mis import
8ba51ac pnijhara: Fix style issues for line length and quote consistency
2abc98b pnijhara: Add final pass to ensure MIS maximality after parallel merge
94e5537 pnijhara: Add maximal_independent_set to get_info
b0641dd pnijhara: Add on_start_tests to mark MIS order test as xfail
c735307 pnijhara: Merge remote-tracking branch 'upstream/main' into parallel-mis
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,3 +15,4 @@ | |
| from .cluster import * | ||
| from .link_prediction import * | ||
| from .dag import * | ||
| from .mis import * | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,189 @@ | ||
| from joblib import Parallel, delayed | ||
| import nx_parallel as nxp | ||
| import networkx as nx | ||
|
|
||
| __all__ = ["maximal_independent_set"] | ||
|
|
||
|
|
||
| @nxp._configure_if_nx_active(should_run=nxp.should_run_if_large(50000)) | ||
| def maximal_independent_set(G, nodes=None, seed=None, get_chunks="chunks"): | ||
| """Returns a random maximal independent set guaranteed to contain | ||
| a given set of nodes. | ||
|
|
||
| This parallel implementation processes nodes in chunks across multiple | ||
| cores, using a Luby-style randomized parallel algorithm for speedup | ||
| on large graphs. | ||
|
|
||
| An independent set is a set of nodes such that the subgraph | ||
| of G induced by these nodes contains no edges. A maximal | ||
| independent set is an independent set such that it is not possible | ||
| to add a new node and still get an independent set. | ||
|
|
||
| The parallel computation divides nodes into chunks and processes them | ||
| in parallel, iteratively building the independent set faster than | ||
| sequential processing on large graphs. | ||
|
|
||
| networkx.maximal_independent_set: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.mis.maximal_independent_set.html | ||
|
|
||
| Parameters | ||
| ---------- | ||
| G : NetworkX graph | ||
| An undirected graph. | ||
|
|
||
| nodes : list or iterable, optional | ||
| Nodes that must be part of the independent set. This set of nodes | ||
| must be independent. If not provided, a random starting node is chosen. | ||
|
|
||
| seed : integer, random_state, or None (default) | ||
| Indicator of random number generation state. | ||
| See :ref:`Randomness<randomness>`. | ||
|
|
||
| get_chunks : str, function (default = "chunks") | ||
| A function that takes in a list of nodes and returns chunks. | ||
| The default chunking divides nodes into n_jobs chunks. | ||
|
|
||
| Returns | ||
| ------- | ||
| indep_nodes : list | ||
| List of nodes that are part of a maximal independent set. | ||
|
|
||
| Raises | ||
| ------ | ||
| NetworkXUnfeasible | ||
| If the nodes in the provided list are not part of the graph or | ||
| do not form an independent set, an exception is raised. | ||
|
|
||
| NetworkXNotImplemented | ||
| If `G` is directed. | ||
|
|
||
| Examples | ||
| -------- | ||
| >>> import networkx as nx | ||
| >>> import nx_parallel as nxp | ||
| >>> G = nx.path_graph(5) | ||
| >>> nxp.maximal_independent_set(G) # doctest: +SKIP | ||
| [4, 0, 2] | ||
| >>> nxp.maximal_independent_set(G, [1]) # doctest: +SKIP | ||
| [1, 3] | ||
|
|
||
| Notes | ||
| ----- | ||
| This algorithm does not solve the maximum independent set problem. | ||
| The parallel version uses a chunk-based parallel algorithm that | ||
| provides speedup on large graphs (>= 50000 nodes). For smaller graphs, | ||
| the NetworkX sequential version is used automatically. | ||
|
|
||
| """ | ||
| if hasattr(G, "graph_object"): | ||
| G = G.graph_object | ||
|
|
||
| # Validate directed graph | ||
| if G.is_directed(): | ||
| raise nx.NetworkXNotImplemented( | ||
| "NX-PARALLEL: Not implemented for directed graphs." | ||
| ) | ||
|
|
||
| # Note: When called through nx.maximal_independent_set with backend="parallel", | ||
| # the @py_random_state(2) decorator in NetworkX runs BEFORE @_dispatchable, | ||
| # so seed is already a Random object by the time it reaches this backend function. | ||
| # However, keeping this conversion for defensive purposes in case this function | ||
| # is called directly via nxp.maximal_independent_set(). | ||
| import random | ||
|
|
||
| if seed is not None and hasattr(seed, "random"): | ||
| rng = seed | ||
| elif seed is not None: | ||
| rng = random.Random(seed) | ||
| else: | ||
| rng = random._inst | ||
|
|
||
| # Validate nodes parameter | ||
| if nodes is not None: | ||
| nodes_set = set(nodes) | ||
| if not nodes_set.issubset(G): | ||
| raise nx.NetworkXUnfeasible(f"{nodes} is not a subset of the nodes of G") | ||
| neighbors = ( | ||
| set.union(*[set(G.adj[v]) for v in nodes_set]) if nodes_set else set() | ||
| ) | ||
| if set.intersection(neighbors, nodes_set): | ||
| raise nx.NetworkXUnfeasible(f"{nodes} is not an independent set of G") | ||
| else: | ||
| nodes_set = set() | ||
|
|
||
| n_jobs = nxp.get_n_jobs() | ||
|
|
||
| # Parallel strategy: Run complete MIS algorithm on node chunks independently | ||
| all_nodes = list(G) | ||
|
|
||
| # Remove required nodes and their neighbors from consideration | ||
| if nodes_set: | ||
| available = set(all_nodes) - nodes_set | ||
| for node in nodes_set: | ||
| available.difference_update(G.neighbors(node)) | ||
| available = list(available) | ||
| else: | ||
| available = all_nodes | ||
|
|
||
| # Shuffle for randomness | ||
| rng.shuffle(available) | ||
|
|
||
| # Split into chunks | ||
| if get_chunks == "chunks": | ||
| chunks = list(nxp.chunks(available, n_jobs)) | ||
| else: | ||
| chunks = list(get_chunks(available)) | ||
|
|
||
| # Precompute adjacency | ||
| adj_dict = {node: set(G.neighbors(node)) for node in G.nodes()} | ||
|
|
||
| def _process_chunk_independent(chunk, chunk_seed): | ||
| """Process chunk completely independently - build local MIS.""" | ||
| local_rng = random.Random(chunk_seed) | ||
| local_mis = [] | ||
| local_excluded = set() | ||
|
|
||
| # Shuffle chunk for randomness | ||
| chunk_list = list(chunk) | ||
| local_rng.shuffle(chunk_list) | ||
|
|
||
| for node in chunk_list: | ||
| if node not in local_excluded: | ||
| # Add to MIS | ||
| local_mis.append(node) | ||
| local_excluded.add(node) | ||
| # Mark neighbors as excluded (only within this chunk) | ||
| for neighbor in adj_dict[node]: | ||
| if neighbor in chunk_list: | ||
| local_excluded.add(neighbor) | ||
|
||
|
|
||
| return local_mis | ||
|
|
||
| # Generate seeds for each chunk | ||
| chunk_seeds = [rng.randint(0, 2**31 - 1) for _ in range(len(chunks))] | ||
|
|
||
| # Process chunks in parallel | ||
| results = Parallel()( | ||
| delayed(_process_chunk_independent)(chunk, chunk_seeds[i]) | ||
| for i, chunk in enumerate(chunks) | ||
| ) | ||
|
|
||
| # Merge results: resolve conflicts between chunks | ||
| indep_set = list(nodes_set) if nodes_set else [] | ||
| excluded = set(all_nodes) - set(available) if nodes_set else set() | ||
|
||
|
|
||
| # Process results in order, greedily adding non-conflicting nodes | ||
| for local_mis in results: | ||
| for node in local_mis: | ||
| if node not in excluded: | ||
| indep_set.append(node) | ||
| excluded.add(node) | ||
| excluded.update(adj_dict[node]) | ||
|
|
||
| # Final pass: ensure maximality by adding any remaining available nodes | ||
| for node in available: | ||
| if node not in excluded: | ||
| indep_set.append(node) | ||
| excluded.add(node) | ||
| excluded.update(adj_dict[node]) | ||
|
|
||
| return indep_set | ||
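The chunk-then-merge strategy in the module above can be illustrated on a plain adjacency dict, without joblib or NetworkX. What follows is a minimal sketch under stated assumptions, not the PR's implementation: the names `chunked_mis`, `_local_greedy` and `n_chunks` are hypothetical, the chunks are processed serially rather than through `Parallel`, the split is a simple round-robin instead of `nxp.chunks`, and chunk membership is tested against a set rather than the list used in the diff.

```python
import random


def _local_greedy(chunk, adj, seed):
    """Greedy independent set restricted to one chunk (hypothetical helper)."""
    rng = random.Random(seed)
    order = list(chunk)
    rng.shuffle(order)
    members = set(order)  # set membership is O(1), unlike a list
    picked, excluded = [], set()
    for node in order:
        if node not in excluded:
            picked.append(node)
            excluded.add(node)
            # Only neighbors inside this chunk are visible at this stage.
            excluded.update(adj[node] & members)
    return picked


def chunked_mis(adj, n_chunks=2, seed=None):
    """Sketch of chunk-local greedy passes followed by a global merge."""
    rng = random.Random(seed)
    available = list(adj)
    rng.shuffle(available)
    # Round-robin split (an assumption made for this sketch).
    chunks = [available[i::n_chunks] for i in range(n_chunks)]
    # One derived seed per chunk, all drawn from the parent generator.
    seeds = [rng.randint(0, 2**31 - 1) for _ in chunks]
    # In the PR this step is a joblib Parallel call; here it runs serially.
    results = [_local_greedy(c, adj, s) for c, s in zip(chunks, seeds)]

    indep, excluded = [], set()
    # Merge: drop candidates that conflict across chunk boundaries.
    for local in results:
        for node in local:
            if node not in excluded:
                indep.append(node)
                excluded.add(node)
                excluded.update(adj[node])
    # Final pass: a node excluded locally may be free again after the merge.
    for node in available:
        if node not in excluded:
            indep.append(node)
            excluded.add(node)
            excluded.update(adj[node])
    return indep


# Path graph 0-1-2-3-4 as an adjacency dict of sets.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
result = chunked_mis(path, n_chunks=2, seed=42)
```

The final pass matters because a chunk may exclude a node on account of a neighbor that the merge later discards; without it the merged set could be independent but not maximal. Since every random choice flows from the single `seed`, two calls with the same seed and the same chunking return the same list.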
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| import networkx as nx | ||
| import nx_parallel as nxp | ||
|
|
||
|
|
||
| def test_should_run_small_graph(): | ||
| """Small graphs should fall back to NetworkX sequential implementation.""" | ||
| G = nx.fast_gnp_random_graph(100, 0.1, seed=42) | ||
| H = nxp.ParallelGraph(G) | ||
|
|
||
| result = nxp.maximal_independent_set.should_run(H) | ||
| assert result == "Graph too small for parallel execution" | ||
|
|
||
|
|
||
| def test_should_run_large_graph(): | ||
| """Large graphs should use the parallel implementation.""" | ||
| G = nx.fast_gnp_random_graph(60000, 0.0001, seed=42) | ||
| H = nxp.ParallelGraph(G) | ||
|
||
|
|
||
| result = nxp.maximal_independent_set.should_run(H) | ||
| assert result is True | ||
|
|
||
|
|
||
| def test_get_chunks(): | ||
| """Test custom chunking function.""" | ||
| G = nx.fast_gnp_random_graph(60000, 0.0001, seed=42) | ||
| H = nxp.ParallelGraph(G) | ||
|
|
||
| def custom_chunks(nodes): | ||
| nodes_list = list(nodes) | ||
| mid = len(nodes_list) // 2 | ||
| return [nodes_list[:mid], nodes_list[mid:]] | ||
|
|
||
| result1 = nxp.maximal_independent_set(H, seed=42) | ||
| result2 = nxp.maximal_independent_set(H, seed=42, get_chunks=custom_chunks) | ||
|
|
||
| # Both should be valid independent sets (correctness is tested by NetworkX) | ||
| for result in [result1, result2]: | ||
| result_set = set(result) | ||
| for node in result: | ||
| neighbors = set(G.neighbors(node)) | ||
| assert not result_set.intersection(neighbors) | ||
|
|
||
|
|
||
| def test_parallel_deterministic_with_seed(): | ||
| """Parallel execution with same seed should produce same result.""" | ||
| G = nx.fast_gnp_random_graph(60000, 0.0001, seed=42) | ||
| H = nxp.ParallelGraph(G) | ||
|
|
||
| result1 = nxp.maximal_independent_set(H, seed=42) | ||
| result2 = nxp.maximal_independent_set(H, seed=42) | ||
|
|
||
| assert result1 == result2 | ||
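`test_get_chunks` above asserts independence only and leaves full correctness to the NetworkX test suite. If a local check of maximality were wanted as well, it could look roughly like the sketch below. `is_maximal_independent_set` is a hypothetical helper, not part of nx-parallel or NetworkX, and it takes a plain adjacency dict of sets so that it runs without either library.

```python
def is_maximal_independent_set(adj, nodes):
    """Return True if `nodes` is independent and cannot be extended.

    `adj` maps each node to the set of its neighbors (assumed format).
    """
    chosen = set(nodes)
    # Independence: no chosen node has a chosen neighbor.
    if any(adj[n] & chosen for n in chosen):
        return False
    # Maximality: every node outside the set has a chosen neighbor.
    return all(adj[n] & chosen for n in adj if n not in chosen)


# Path graph 0-1-2-3-4 as an adjacency dict of sets.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
ok_full = is_maximal_independent_set(path, [0, 2, 4])
ok_alt = is_maximal_independent_set(path, [1, 3])
not_independent = is_maximal_independent_set(path, [0, 1])
not_maximal = is_maximal_independent_set(path, [0])
```

With a NetworkX graph `G`, an equivalent adjacency dict could be built as `{n: set(G[n]) for n in G}`.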