fix(server): keep schema ~create_time userdata as Date after reload#3026
fix(server): keep schema ~create_time userdata as Date after reload#3026dpol1 wants to merge 5 commits into
Conversation
Normalize server-side schema ~create_time userdata in SchemaElement so serializer reloads and fromMap paths keep the Date contract. Add SchemaElement, TextSerializer, and BinarySerializer coverage. Closes apache#3013
There was a problem hiding this comment.
Pull request overview
This PR fixes a schema reload regression where internal schema userdata Userdata.CREATE_TIME ("~create_time") can round-trip through backend JSON serializers as a String and come back as String after reload, violating the server-side contract that it should be a java.util.Date.
Changes:
- Added
SchemaElement.normalizeUserdataValue(key, value)to re-type~create_timestring values back intoDate(and throw a clearer error on invalid values). - Routed both
SchemaElement.userdata(String, Object)andSchemaElement.userdata(Userdata)through the normalization logic (and added a null-argument check for the bulk setter). - Added regression tests for normalization behavior and for Text/Binary serializer round-trips, and registered them in
UnitTestSuite.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/schema/SchemaElement.java | Normalizes ~create_time userdata on set (single and bulk), ensuring reload paths restore Date instead of String. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/core/SchemaElementTest.java | Unit coverage for normalization behavior, idempotency, invalid strings, and null bulk userdata validation. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/serializer/TextSerializerTest.java | Regression test verifying TextSerializer schema round-trip keeps CREATE_TIME as Date. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/serializer/BinarySerializerTest.java | Regression test verifying BinarySerializer schema round-trip keeps CREATE_TIME as Date. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/UnitTestSuite.java | Registers the new unit tests in the suite. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
imbajin
left a comment
There was a problem hiding this comment.
Reviewed correctness, design, and tests. No blocking issues — fix is correct and narrowly scoped, all backend reload paths (Text / Binary / Mysql / Cassandra / HBase / Postgres / Palo / Hstore + the four fromMap paths) route through the new helper, and DateUtil.parse is symmetric with the "yyyy-MM-dd HH:mm:ss.SSS" format used by JsonUtil/HugeGraphSONModule. Inline comments cover important refinements (helper placement, public setter docs, test depth). A few additional minor notes below that don't have a clean inline target:
- 🧹
removeUserdata(Userdata)(SchemaElement.java:114-118) lacks a null check — addingE.checkArgumentNotNull(userdata, ...)keeps the bulk setters symmetric with the new check onuserdata(Userdata). - 🧹 Behavior change worth a release note: any REST client that posts
~create_timeas a JSON string in user-data now silently receives aDate(or anIllegalArgumentExceptionif malformed).~create_timeis a server-reserved key (~prefix, written bySchemaTransaction.setCreateTimeIfNeeded), so risk is small, but flagging it in the PR body / changelog is helpful. - 🧹 Follow-up tracking:
Userdata.DEFAULT_VALUEhas the same Jackson type-erasure problem (e.g., a property key whose data type isDatewill reload its default value asString). The PR description already defers thehugegraph-structmirror — please addDEFAULT_VALUEto the same follow-up list, since the root cause and fix shape are identical.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3026 +/- ##
============================================
- Coverage 35.85% 29.71% -6.14%
+ Complexity 338 264 -74
============================================
Files 802 803 +1
Lines 67995 68052 +57
Branches 8902 8908 +6
============================================
- Hits 24381 20225 -4156
- Misses 41008 45506 +4498
+ Partials 2606 2321 -285 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
imbajin
left a comment
There was a problem hiding this comment.
No blocking issues. The core fix (normalizing CREATE_TIME String→Date on deserialization) is correct and the main round-trip paths are tested. Four non-blocking observations below.
| "was " + (value == null ? "null" : value.getClass()), | ||
| value instanceof Date); | ||
| Assert.assertEquals(created, value); | ||
| } |
There was a problem hiding this comment.
MysqlSerializer and CassandraSerializer are also affected by the same CREATE_TIME deserialization gap:
"
userdata(String, Object)covers serializer reload paths such as TextSerializer, BinarySerializer, MysqlSerializer, and CassandraSerializer"
Only TextSerializer and BinarySerializer have round-trip tests added. Adding equivalent testPropertyKeyUserdataCreateTimeRoundTripsAsDate tests for MysqlSerializer and CassandraSerializer would close this coverage gap for the stated fix surface.
There was a problem hiding this comment.
Holding off, with reasoning:
MysqlSerializer.readUserdata (L167) and CassandraSerializer.readUserdata
(L222) both call the same schema.userdata(key, value) setter as
TextSerializer/BinarySerializer. That setter now routes through the
single Userdata.put normalization path, which is already exercised by:
TextSerializerTest/BinarySerializerTestround-tripsSchemaElementTest(constructor, single setter, bulk,fromMap)
There's no existing unit-test harness for the SQL serializers — adding one
means backend/mock infra for zero new code paths. The marginal coverage
doesn't justify it.
Better closure for the shared logic: a direct UserdataTest.normalizeValue
unit test, serializer-agnostic. Happy to add that as a follow-up; proposing
not to gate this PR on it. Pushback welcome if you see a path difference
I'm missing.
| "was " + (value == null ? "null" : value.getClass()), | ||
| value instanceof Date); | ||
| Assert.assertEquals(created, value); | ||
| } |
There was a problem hiding this comment.
PropertyKey is round-tripped here. VertexLabel, EdgeLabel, and IndexLabel each have their own writeXxx/readXxx methods and reach the normalization through the bulk setter (fromMap → userdata(Userdata)), a different path from PropertyKey (single-key setter). A regression in those paths would not be caught by the existing tests.
Would it be possible to add at minimum one VertexLabel round-trip test here as the representative bulk-setter path?
There was a problem hiding this comment.
@imbajin not quite — readUserdata is a single shared method (L909).
readVertexLabel/readEdgeLabel/readPropertyKey/readIndexLabel all call it, and it uses the single-key setter for every type. So a VertexLabel round-trip here runs the exact same path as PropertyKey, not the bulk one.
The bulk path (fromMap → userdata(Userdata)) is the schema-builder side, already pinned by SchemaElementTest#testVertexLabelFromMapNormalizesCreateTimeString. And after this PR both setters go through Userdata.put, so it's one normalization point. IMHO a VertexLabel serializer test would just re-run the identical loop, no new path covered. I'll still add one if you want the readVertexLabel wiring guarded — just say so.
There was a problem hiding this comment.
Good point, I agree that the serializer round-trip path is shared here. A VertexLabel serializer round-trip would mostly re-run the same readUserdata() loop, so I do not think we need to insist on that test.
I found a separate edge case while re-checking the current implementation: moving normalization into Userdata.put() also affects eliminate inputs, where the value is only a placeholder and removeUserdata(Userdata) deletes by key.
schema.xxx(...)
.userdata(Userdata.CREATE_TIME, "")
.eliminate()
|
v
Userdata.put("~create_time", "")
|
v
DateUtil.parse("") -> IllegalArgumentException
|
v
never reaches removeUserdata(...)
Existing eliminate tests use placeholder values like .userdata("icon", "").eliminate() / .userdata("max", "").eliminate(), so this seems like a real behavior change for deleting ~create_time via the builder or REST remove action.
Could we either skip normalization for eliminate/delete inputs, or explicitly reject removal of Userdata.CREATE_TIME with a dedicated check? Right now it fails as a date parse error before the eliminate path can apply its key-only semantics.
close [Bug] schema ~create_time userdata round-trips Date as String after cache reload #3013
Userdata.CREATE_TIME("~create_time") is written asjava.util.Dateby
SchemaTransaction.setCreateTimeIfNeeded(), then persisted through backendserializers as JSON. On reload, userdata is deserialized through raw
Map.classpaths, so Jackson cannot restore the value type andschemaElement.userdata().get(Userdata.CREATE_TIME)can come back asString.That breaks callers and existing tests that expect
~create_timeto keep theserver-side
Datecontract after a schema is reloaded from backend storage.Main Changes
Added a single
SchemaElement.normalizeUserdataValue(key, value)helper.Userdata.CREATE_TIMEstring values back withDateUtil.parse(...).Datevalues and all other userdata keys unchanged.IllegalArgumentExceptionif a~create_timestring is invalid.Routed both userdata setters through the helper:
userdata(String, Object)covers serializer reload paths such asTextSerializer,BinarySerializer,MysqlSerializer, andCassandraSerializer.userdata(Userdata)covers schemafromMap()hydration paths forPropertyKey,VertexLabel,EdgeLabel, andIndexLabel.Added a null check to
userdata(Userdata)for consistent argument validation.Added regression coverage for schema userdata normalization and serializer
round trips.
This PR is intentionally limited to the server-side
SchemaElement. The mirrorclass in
hugegraph-structhas a similar userdata shape and can be handled in aseparate follow-up PR.
Verifying these changes
SchemaElementTestcovers single-key normalization, bulk normalization,idempotency for
Date, invalid~create_timestrings, null bulk userdata,and non-
CREATE_TIMEkeys.TextSerializerTest#testPropertyKeyUserdataCreateTimeRoundTripsAsDateverifies the text serializer schema round trip keeps
CREATE_TIMEasDate.BinarySerializerTest#testPropertyKeyUserdataCreateTimeRoundTripsAsDateverifies the binary serializer schema round trip keeps
CREATE_TIMEasDate.UnitTestSuite.Does this PR potentially affect the following parts?
The public SchemaElement.userdata(...) signatures are unchanged. The behavior
change is limited to internal userdata normalization for Userdata.CREATE_TIME.
Documentation Status
Notes
Userdata.CREATE_TIME(~create_time) only.~create_timeis a server-reserved userdata key; valid string values are normalized back tojava.util.Date, while malformed values now fail withIllegalArgumentException.Userdata.DEFAULT_VALUE(~default_value) reloads and checking the same raw userdata deserialization gap inhugegraph-struct.