Commit e078a25

Merge pull request #14 from ebremer/develop
HDF5-backed BeakGraph
2 parents 63cc136 + bcb670a commit e078a25

184 files changed

+22041
-4474
lines changed

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-/target/
+/target/

LICENSE

Lines changed: 201 additions & 201 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 62 additions & 1 deletion
@@ -6,7 +6,68 @@
 title="BeakGraph"
 style="display: inline-block; margin: 0 auto; max-width: 150px">
 
-BeakGraph is a [Apache Jena](https://jena.apache.org/) Graph implementation backed by [Apache Arrow](https://arrow.apache.org/)
+## Building
+
+Configuration file generation
+```
+java -Xmx16G -agentlib:native-image-agent=config-output-dir=src\main\resources\META-INF\native-image -jar target\BeakGraph-0.10.0.jar
+```
+Native Command-line
+```
+mvn -Pcmdlinenative clean package
+```
+Jar Command-line
+```
+mvn -Pcmdlinejar clean package
+```
+Core Library Jar
+```
+mvn -Plib clean package
+```
+
+## Using BeakGraph in your code
+
+### Creating a BeakGraph from your data
+```
+HDF5Writer.Builder()
+    .setSource(file)
+    .setSpatial(true) // only needed if GeoSPARQL spatial data is present
+    .setDestination(dest)
+    .build()
+    .write();
+```
+
+### Using a BeakGraph in your Apache Jena code
+```
+try (HDF5Reader reader = new HDF5Reader(dest)) {
+    BeakGraph bg = new BeakGraph( reader );
+    Dataset ds = bg.getDataset();
+    ds.getDefaultModel().write(System.out, "NTRIPLE");
+}
+```
+
+BeakGraph is an [Apache Jena](https://jena.apache.org/) Graph implementation backed by [HDF5](https://www.hdfgroup.org/solutions/hdf5/).
+BeakGraph's HDF5 design is heavily inspired by [HDT](https://www.rdfhdt.org/).
+
+### Author's notes
+The first iteration of BeakGraph was backed by Apache Arrow instead of [HDF5](https://www.hdfgroup.org/solutions/hdf5/). An Apache Arrow version will return. The reasons for the switch are varied, and some of them are simply experimentation.
+The general idea of BeakGraph is a read-only, searchable, indexed set of binary [succinct data structures](https://en.wikipedia.org/wiki/Succinct_data_structure) to represent an [RDF Dataset](https://www.w3.org/TR/rdf11-datasets/).
+What these succinct data structures are stored in is somewhat immaterial, but each choice of container has its pros and cons. HDF5 treats multi-dimensional arrays as first-class citizens and has a free viewer for
+HDF5 files called [HDFView](https://www.hdfgroup.org/download-hdfview/). HDFView provides a convenient way to debug the succinct data structures during development. There are other perks to HDF5 which will become apparent in time.
+
+Support for spatial indexing based on [GeoSPARQL](https://github.com/opengeospatial/ogc-geosparql) is being worked on.
+
+The full list of containers under consideration is:
+* [HDF5](https://www.hdfgroup.org/solutions/hdf5/)
+* [Apache Arrow](https://arrow.apache.org/)
+* [Zarr](https://zarr.dev/)
+* [Zip](https://en.wikipedia.org/wiki/ZIP_(file_format))
+* [DICOM](https://www.dicomstandard.org/)
+* [LWS](https://github.com/w3c/lws-protocol)
+
+
+### Historical
+The original BeakGraph was an [Apache Jena](https://jena.apache.org/) Graph implementation backed by [Apache Arrow](https://arrow.apache.org/)
 wrapped in a [Research Object Crate (RO-Crate)](https://www.researchobject.org/ro-crate/) inspired by [HDT](https://www.rdfhdt.org/).
 
 Developed to power [Halcyon](https://github.com/halcyon-project/Halcyon). See [Arxiv](https://arxiv.org/) paper at http://arxiv.org/abs/2304.10612
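
The README's author's notes describe BeakGraph as a read-only, indexed set of succinct data structures in the spirit of HDT. As a rough illustration of what such a structure can look like, here is a minimal rank/select bitmap in Java. It is a sketch under stated assumptions, not BeakGraph's implementation: the class name `RankSelectBitmap`, the 64-bit block size, and the convention that a 1-bit marks the last entry of each adjacency list are all chosen for this example.

```java
import java.util.BitSet;

// Illustrative only: a read-only bitmap with rank/select support.
// Names and layout are assumptions of this sketch, not BeakGraph's API.
final class RankSelectBitmap {
    private static final int BLOCK = 64;  // bits summarized per directory entry
    private final BitSet bits;
    private final int length;
    private final int[] blockRank;        // blockRank[b] = number of 1-bits before block b

    RankSelectBitmap(boolean[] values) {
        length = values.length;
        bits = new BitSet(length);
        for (int i = 0; i < length; i++) {
            if (values[i]) bits.set(i);
        }
        int blocks = (length + BLOCK - 1) / BLOCK;
        blockRank = new int[blocks + 1];
        for (int b = 0; b < blocks; b++) {
            int from = b * BLOCK;
            int to = Math.min(length, from + BLOCK);
            blockRank[b + 1] = blockRank[b] + bits.get(from, to).cardinality();
        }
    }

    // Number of 1-bits in positions [0, i).
    int rank1(int i) {
        if (i < 0 || i > length) throw new IndexOutOfBoundsException("i=" + i);
        int b = i / BLOCK;
        return blockRank[b] + bits.get(b * BLOCK, i).cardinality();
    }

    // Position of the k-th 1-bit (k is 1-based), or -1 if there are fewer than k ones.
    int select1(int k) {
        if (k < 1 || k > blockRank[blockRank.length - 1]) return -1;
        int lo = 0, hi = blockRank.length - 1;
        // Invariant: blockRank[lo] < k <= blockRank[hi]; narrow down to a single block.
        while (hi - lo > 1) {
            int mid = (lo + hi) >>> 1;
            if (blockRank[mid] < k) lo = mid; else hi = mid;
        }
        int pos = bits.nextSetBit(lo * BLOCK);
        for (int seen = blockRank[lo] + 1; seen < k; seen++) {
            pos = bits.nextSetBit(pos + 1);
        }
        return pos;
    }

    // Half-open range [start, end) of the list owned by the given 1-based parent,
    // assuming a 1-bit marks the last entry of every list.
    int[] childRange(int parent) {
        int end = select1(parent);
        if (end < 0) throw new IllegalArgumentException("no such parent: " + parent);
        int start = parent == 1 ? 0 : select1(parent - 1) + 1;
        return new int[] { start, end + 1 };
    }

    public static void main(String[] args) {
        // Subject 1 -> {1, 2}, subject 2 -> {3}, subject 3 -> {1, 4, 5}
        int[] predicates = {1, 2, 3, 1, 4, 5};
        boolean[] lastOfList = {false, true, true, false, false, true};
        RankSelectBitmap bm = new RankSelectBitmap(lastOfList);
        int[] r = bm.childRange(3);
        for (int i = r[0]; i < r[1]; i++) {
            System.out.println("subject 3 -> predicate ID " + predicates[i]);
        }
    }
}
```

The flat `predicates` array plus the bitmap replace a per-subject list of lists: finding a subject's predicates takes two `select1` calls rather than a pointer lookup, which is why this kind of layout suits array-oriented containers such as HDF5.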

dependency-reduced-pom.xml

Lines changed: 254 additions & 0 deletions
@@ -0,0 +1,254 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <groupId>com.ebremer</groupId>
+  <artifactId>BeakGraph</artifactId>
+  <name>BeakGraph</name>
+  <version>0.10.0</version>
+  <description>Library for creating indexed binary storages for Apache Jena Graphs and Datasets</description>
+  <issueManagement>
+    <system>github</system>
+    <url>https://github.com/ebremer/BeakGraph/issues</url>
+  </issueManagement>
+  <inceptionYear>2021</inceptionYear>
+  <developers>
+    <developer>
+      <id>ebremer</id>
+      <name>Erich Bremer</name>
+      <email>[email protected]</email>
+      <roles>
+        <role>author</role>
+      </roles>
+    </developer>
+  </developers>
+  <licenses>
+    <license>
+      <name>The Apache Software License, Version 2.0</name>
+      <url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>
+    </license>
+  </licenses>
+  <build>
+    <plugins>
+      <plugin>
+        <artifactId>maven-dependency-plugin</artifactId>
+        <version>${maven-dependency-plugin.version}</version>
+      </plugin>
+      <plugin>
+        <artifactId>maven-compiler-plugin</artifactId>
+        <version>3.14.1</version>
+        <configuration>
+          <release>25</release>
+          <fork>true</fork>
+          <compilerArgs>
+            <arg>--enable-preview</arg>
+          </compilerArgs>
+          <compilerArgument>--enable-preview</compilerArgument>
+          <executable>${java.home}/bin/javac</executable>
+        </configuration>
+      </plugin>
+    </plugins>
+  </build>
+  <profiles>
+    <profile>
+      <id>lib</id>
+      <build>
+        <finalName>BeakGraph-${project.version}</finalName>
+        <plugins>
+          <plugin>
+            <artifactId>maven-shade-plugin</artifactId>
+            <version>${maven-shade-plugin.version}</version>
+            <executions>
+              <execution>
+                <phase>package</phase>
+                <goals>
+                  <goal>shade</goal>
+                </goals>
+              </execution>
+            </executions>
+            <configuration>
+              <shadedArtifactAttached>false</shadedArtifactAttached>
+              <transformers>
+                <transformer>
+                  <manifestEntries>
+                    <Multi-Release>true</Multi-Release>
+                  </manifestEntries>
+                </transformer>
+                <transformer />
+              </transformers>
+              <filters>
+                <filter>
+                  <artifact>*:*</artifact>
+                  <excludes>
+                    <exclude>META-INF/*.SF</exclude>
+                    <exclude>META-INF/*.DSA</exclude>
+                    <exclude>META-INF/*.RSA</exclude>
+                  </excludes>
+                </filter>
+              </filters>
+            </configuration>
+          </plugin>
+          <plugin>
+            <artifactId>maven-enforcer-plugin</artifactId>
+            <version>${maven-enforcer-plugin.version}</version>
+            <executions>
+              <execution>
+                <id>enforce</id>
+                <goals>
+                  <goal>enforce</goal>
+                </goals>
+                <configuration>
+                  <rules>
+                    <banDuplicatePomDependencyVersions />
+                  </rules>
+                </configuration>
+              </execution>
+            </executions>
+          </plugin>
+        </plugins>
+      </build>
+    </profile>
+    <profile>
+      <id>cmdlinenative</id>
+      <build>
+        <plugins>
+          <plugin>
+            <groupId>org.graalvm.buildtools</groupId>
+            <artifactId>native-maven-plugin</artifactId>
+            <version>${native-maven-plugin.version}</version>
+            <extensions>true</extensions>
+            <executions>
+              <execution>
+                <id>build-native</id>
+                <phase>package</phase>
+                <goals>
+                  <goal>build</goal>
+                </goals>
+              </execution>
+            </executions>
+            <configuration>
+              <imageName>beakgraph</imageName>
+              <mainClass>com.ebremer.beakgraph.cmdline.beakgraph</mainClass>
+              <debug>true</debug>
+              <skipNativeTests>true</skipNativeTests>
+              <verbose>true</verbose>
+              <buildArgs>
+                <buildArg>--no-fallback</buildArg>
+                <buildArg>--enable-url-protocols=https</buildArg>
+                <buildArg>-H:+AddAllCharsets</buildArg>
+                <buildArg>-march=native</buildArg>
+                <buildArg>--no-server</buildArg>
+                <buildArg>-J-Xmx32G</buildArg>
+              </buildArgs>
+            </configuration>
+          </plugin>
+        </plugins>
+      </build>
+    </profile>
+    <profile>
+      <id>cmdlinejar</id>
+      <build>
+        <plugins>
+          <plugin>
+            <artifactId>maven-shade-plugin</artifactId>
+            <version>${maven-shade-plugin.version}</version>
+            <executions>
+              <execution>
+                <phase>package</phase>
+                <goals>
+                  <goal>shade</goal>
+                </goals>
+              </execution>
+            </executions>
+            <configuration>
+              <shadedArtifactAttached>false</shadedArtifactAttached>
+              <transformers>
+                <transformer>
+                  <mainClass>com.ebremer.beakgraph.BG</mainClass>
+                  <manifestEntries>
+                    <Multi-Release>true</Multi-Release>
+                  </manifestEntries>
+                </transformer>
+                <transformer />
+              </transformers>
+              <filters>
+                <filter>
+                  <artifact>*:*</artifact>
+                  <excludes>
+                    <exclude>META-INF/*.SF</exclude>
+                    <exclude>META-INF/*.DSA</exclude>
+                    <exclude>META-INF/*.RSA</exclude>
+                  </excludes>
+                </filter>
+              </filters>
+            </configuration>
+          </plugin>
+        </plugins>
+      </build>
+    </profile>
+  </profiles>
+  <repositories>
+    <repository>
+      <id>central</id>
+      <name>Maven Central</name>
+      <url>https://repo1.maven.org/maven2</url>
+    </repository>
+    <repository>
+      <id>halcyon</id>
+      <name>Halcyon</name>
+      <url>https://cursus.bmi.stonybrookmedicine.edu/releases</url>
+    </repository>
+    <repository>
+      <id>apache</id>
+      <name>apache</name>
+      <url>https://repository.apache.org/snapshots</url>
+    </repository>
+    <repository>
+      <id>adatao-releases</id>
+      <url>https://raw.githubusercontent.com/adatao/mvnrepos/master/releases/</url>
+    </repository>
+    <repository>
+      <id>unidata-all</id>
+      <name>Unidata All</name>
+      <url>https://artifacts.unidata.ucar.edu/repository/unidata-all/</url>
+    </repository>
+  </repositories>
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.jena</groupId>
+      <artifactId>apache-jena-libs</artifactId>
+      <version>5.6.0</version>
+      <type>pom</type>
+      <scope>compile</scope>
+    </dependency>
+  </dependencies>
+  <distributionManagement>
+    <repository>
+      <id>halcyon</id>
+      <url>https://cursus.bmi.stonybrookmedicine.edu/releases</url>
+    </repository>
+  </distributionManagement>
+  <properties>
+    <commons-collections.version>4.5.0</commons-collections.version>
+    <exec.mainClass>com.ebremer.beakgraph.BeakGraph</exec.mainClass>
+    <java.version>25</java.version>
+    <maven-shade-plugin.version>3.6.1</maven-shade-plugin.version>
+    <jena.version>5.6.0</jena.version>
+    <maven-compiler-plugin.version>3.14.1</maven-compiler-plugin.version>
+    <rocrate4j.version>0.5.0</rocrate4j.version>
+    <jetty.version>11.0.26</jetty.version>
+    <jaxb.version>2.4.0-b180830.0359</jaxb.version>
+    <commons-compress.version>1.28.0</commons-compress.version>
+    <jts.ver>1.20.0</jts.ver>
+    <maven-dependency-plugin.version>3.9.0</maven-dependency-plugin.version>
+    <native-maven-plugin.version>0.11.3</native-maven-plugin.version>
+    <maven-enforcer-plugin.version>3.6.2</maven-enforcer-plugin.version>
+    <slf4j.version>2.0.17</slf4j.version>
+    <hilbert.ver>0.2.3</hilbert.ver>
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+    <arrow.version>18.3.0</arrow.version>
+    <log4j.version>2.25.2</log4j.version>
+    <jcommander.version>3.0</jcommander.version>
+    <unidata.version>5.9.1</unidata.version>
+    <jhdf.version>0.10.0</jhdf.version>
+  </properties>
+</project>
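
Both shade profiles in this pom exclude `META-INF/*.SF`, `META-INF/*.DSA` and `META-INF/*.RSA`. These are JAR signature files: once classes from several jars are merged into one, the inherited signatures no longer match the contents, so they are typically dropped. The Java sketch below shows which entry names those three patterns would match. The class `ShadeFilterSketch` and its simplified glob matcher are assumptions made for illustration and are not the plugin's own matching code.

```java
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: mimics the effect of the pom's <excludes> on jar entry names.
final class ShadeFilterSketch {
    // The three exclude patterns copied from the <filter> block.
    static final List<String> EXCLUDES =
            List.of("META-INF/*.SF", "META-INF/*.DSA", "META-INF/*.RSA");

    // Translate a simple glob into a regex; '*' matches any run of characters except '/'.
    static Pattern toRegex(String glob) {
        String[] parts = glob.split("\\*", -1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) sb.append("[^/]*");
            sb.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(sb.toString());
    }

    // True if the entry name matches any of the exclude patterns.
    static boolean isExcluded(String entryName) {
        return EXCLUDES.stream().anyMatch(g -> toRegex(g).matcher(entryName).matches());
    }

    public static void main(String[] args) {
        for (String name : List.of("META-INF/BCKEY.SF", "META-INF/MANIFEST.MF", "com/ebremer/Foo.class")) {
            System.out.println(name + " excluded: " + isExcluded(name));
        }
    }
}
```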
