Install & Run: http://dev.grakn.ai/docs/running-grakn/install-and-run
New Features
-
Selective locking on commit.
When loading concurrently, we want to minimise the amount of locking needed to improve performance whilst retaining correctness and consistency.
Currently we lock the graph if any of the committing transactions contains attribute changes. In practice this means that for mixed loads we lock almost all the time. This is overly defensive: the only situations that require locking on commit are when two transactions share an inserted attribute or want to create a shard vertex for the same type.
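The locking condition above can be sketched as a simple overlap check between two transactions' pending writes. This is only an illustration: `SelectiveLockSketch`, `PendingChanges`, `requiresLock` and the string encoding of attributes are hypothetical names, not Grakn's actual classes.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: decides whether a commit needs the lock at all. */
final class SelectiveLockSketch {

    /** What a transaction is about to write (illustrative fields only). */
    static final class PendingChanges {
        final Set<String> insertedAttributes; // assumed encoding: "typeLabel:value"
        final Set<String> shardedTypes;       // labels of types needing a new shard vertex

        PendingChanges(Set<String> insertedAttributes, Set<String> shardedTypes) {
            this.insertedAttributes = insertedAttributes;
            this.shardedTypes = shardedTypes;
        }
    }

    /** Lock only if the transactions share an inserted attribute or a sharded type. */
    static boolean requiresLock(PendingChanges committing, PendingChanges other) {
        return !Collections.disjoint(committing.insertedAttributes, other.insertedAttributes)
                || !Collections.disjoint(committing.shardedTypes, other.shardedTypes);
    }

    /** Small helper to build a mutable set from literals. */
    static Set<String> setOf(String... items) {
        Set<String> set = new HashSet<>();
        Collections.addAll(set, items);
        return set;
    }

    public static void main(String[] args) {
        PendingChanges tx1 = new PendingChanges(setOf("name:Alice"), setOf());
        PendingChanges tx2 = new PendingChanges(setOf("name:Alice"), setOf("person"));
        PendingChanges tx3 = new PendingChanges(setOf("name:Bob"), setOf());

        System.out.println(requiresLock(tx1, tx2)); // true: both insert name:Alice
        System.out.println(requiresLock(tx1, tx3)); // false: nothing is shared
    }
}
```

Transactions whose writes do not overlap, such as `tx1` and `tx3` here, would commit without taking the lock.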
To lock more effectively, we need to a) lock selectively based on the creation of identical attributes among different transactions, and b) lock selectively based on the creation of shards of the same type among different transactions. Hence we introduce a basic selective locking behaviour. -
Replaced thrift storage driver with CQL.
Moved from the deprecated Thrift protocol to the latest CQL native protocol to communicate with Cassandra. -
Retrievable Explanations.
Large explanations can break gRPC message limits, specifically nesting depth when returning deep explanation trees. With this change, each layer of the explanation tree can be retrieved from the server separately. This reflects the changes in the protocol here: vaticle/typedb-protocol#20.
We retain the storage of Explanation trees on the Server, and expose a new RPC message with corresponding handler on the server to retrieve particular layers of the tree using the query pattern provided by the client. The query pattern is used to access the query cache directly.
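A rough sketch of layer-by-layer retrieval, assuming a server-side cache keyed by query pattern. `LayeredExplanationSketch`, `store`, `getExplanation` and `walk` are hypothetical stand-ins; the real RPC message and handler are defined in the protocol change referenced above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: an explanation tree fetched one layer at a time. */
final class LayeredExplanationSketch {

    /** Stand-in for the server-side cache: pattern -> patterns of the answers explaining it. */
    private final Map<String, List<String>> cache = new HashMap<>();

    void store(String pattern, List<String> childPatterns) {
        cache.put(pattern, childPatterns);
    }

    /** Stand-in for the RPC handler: returns exactly one layer, never the whole tree. */
    List<String> getExplanation(String pattern) {
        return cache.getOrDefault(pattern, Collections.emptyList());
    }

    /** Client-side walk: one request per node, so no single response is deeply nested. */
    List<String> walk(String rootPattern) {
        List<String> visited = new ArrayList<>();
        Deque<String> toVisit = new ArrayDeque<>();
        toVisit.push(rootPattern);
        while (!toVisit.isEmpty()) {
            String pattern = toVisit.pop();
            visited.add(pattern);
            for (String child : getExplanation(pattern)) {
                toVisit.push(child);
            }
        }
        return visited;
    }
}
```

Each response carries only the direct children of one node, so its size is independent of the depth of the tree.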
Additionally, `pattern` in `ConceptMap`s now contains the IDs indicated in the actual concept map as well.
Finishes #4605, and #4545 -
Integrate checking for unused Java dependencies.
Integrate checking for unused Java dependencies -
Statistics for MetaTypes and Modify `compute count;` behavior.
As noted in #5421, we were not caching statistics for Entity, Relation and Attribute like we do for `Thing`. Since we already answer `compute count in thing;` or any concrete label using statistics, we can now also answer `compute count in entity/relation/attribute;` the same way instead of having to execute the full `compute count` via OLAP.
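One way to picture answering a count from cached statistics is shown below. The roll-up of each delta to the meta type and to `thing` is an assumption made for illustration, and `TypeStatisticsSketch`, `recordDelta` and `count` are hypothetical names rather than Grakn's implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: instance counts kept per label so a count is a map lookup. */
final class TypeStatisticsSketch {

    private final Map<String, Long> counts = new HashMap<>();

    /**
     * Records a change in instance count for a concrete type and, by assumption,
     * rolls the same delta up to its meta type and to "thing".
     */
    void recordDelta(String label, String metaType, long delta) {
        counts.merge(label, delta, Long::sum);
        counts.merge(metaType, delta, Long::sum);
        counts.merge("thing", delta, Long::sum);
    }

    /** Answers a count from the cached statistics, without traversing the graph. */
    long count(String label) {
        return counts.getOrDefault(label, 0L);
    }

    public static void main(String[] args) {
        TypeStatisticsSketch stats = new TypeStatisticsSketch();
        stats.recordDelta("person", "entity", 3);
        stats.recordDelta("company", "entity", 2);
        stats.recordDelta("name", "attribute", 4);

        System.out.println(stats.count("entity")); // 5
        System.out.println(stats.count("thing"));  // 9
    }
}
```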
Bugs Fixed
-
Fix key enforcement when dealing with concurrent txs.
When concurrent Grakn transactions are used, it is possible to insert two instances with the same type and key, which violates key uniqueness. We ensure uniqueness by doing graph validation after attribute merge and locking on commit if key relations are present within the transaction. -
Tag dependency for @graknlabs_grakn_console.
Previously, `grakn-core-all` depended on a snapshot version of `grakn-console` which could not be found in the release repo. This situation can be prevented by enforcing a dependency on a tag. -
AtomicBase Equals Method Fix - Flaky Define test.
There is a define test, `GraqlDefineIT.testDefineIsAbstract()`, which fails occasionally.
This PR fixes it by updating the `equals()` method on `AtomicBase`, which led to overwriting elements when collecting the Atoms required by the query into a set.
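To illustrate the general failure mode (not the actual `AtomicBase` code), the sketch below shows how an `equals()` that ignores a distinguishing field makes a set silently drop an element. `CoarseAtom` and `PreciseAtom` are hypothetical classes invented for this example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical sketch: a too-coarse equals() loses elements in a set. */
final class EqualsInSetSketch {

    /** Equality ignores the variable, so atoms over different variables collide. */
    static final class CoarseAtom {
        final String type;
        final String variable;

        CoarseAtom(String type, String variable) {
            this.type = type;
            this.variable = variable;
        }

        @Override public boolean equals(Object o) {
            return o instanceof CoarseAtom && type.equals(((CoarseAtom) o).type);
        }

        @Override public int hashCode() {
            return type.hashCode();
        }
    }

    /** Equality covers every field that distinguishes two atoms. */
    static final class PreciseAtom {
        final String type;
        final String variable;

        PreciseAtom(String type, String variable) {
            this.type = type;
            this.variable = variable;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof PreciseAtom)) return false;
            PreciseAtom other = (PreciseAtom) o;
            return type.equals(other.type) && variable.equals(other.variable);
        }

        @Override public int hashCode() {
            return Objects.hash(type, variable);
        }
    }

    public static void main(String[] args) {
        Set<CoarseAtom> coarse = new HashSet<>();
        coarse.add(new CoarseAtom("isa", "x"));
        coarse.add(new CoarseAtom("isa", "y"));
        System.out.println(coarse.size()); // 1: the second atom was dropped

        Set<PreciseAtom> precise = new HashSet<>();
        precise.add(new PreciseAtom("isa", "x"));
        precise.add(new PreciseAtom("isa", "y"));
        System.out.println(precise.size()); // 2
    }
}
```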
Code Refactors
-
Merge JanusGraph into Grakn.
We are removing all the JanusGraph dependencies and including a customised sub-set of the full JanusGraph project into the Grakn repo.
This is because we need to apply more and more customisation to the graph representation
and further optimise for Grakn's use cases (we also don't need all the features and classes that JanusGraph provides).
We are also moving from the deprecated Thrift protocol to the latest CQL native protocol to communicate with Cassandra. -
Replace unneeded Gremlin noop step with a cheaper alternative.
A minor but free performance enhancement for `read` queries, in which we often have `__.identity()` steps in the traversal. These are redundant (documented no-op in janus) and were generated by our need to create anonymous traversals that could be joined to existing ones. Instead, we can use `__.start()` to obtain a new traversal. Since the `identity()` method appears to iterate over all elements in the traversal (it takes non-zero time), we want to avoid it. -
Don't start Cassandra Thrift server.
As we have moved to only use CQL, we don't need to start the Cassandra Thrift server anymore. -
Reuse driver.
We were still using the older version of the Cassandra driver for Hadoop operations. This PR rewrites some classes so that we use the latest version everywhere. -
Grpc threads optimisation.
- Don't let grpc threads grow unbounded: previously, with a lot of concurrent grpc transactions the server could start creating a very large number of threads (one per connection request) instead of reusing existing threads, specifically when handling the Event Loop Group.
We now provide a thread pool to GrpcServer so that threads are reused, following the best practices described by the gRPC developers.
- Kill all grpc transaction threads when a client either closes a remote session or terminates the connection abruptly: previously, when a client terminated a grpc connection or a specific grpc session, we would just close the current grpc session; we now track all the transaction threads associated with it and terminate those as well.
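The effect of a bounded pool can be shown with the JDK alone. In gRPC Java an executor is handed to the server through `ServerBuilder.executor(...)`; how Grakn wires its pool is not shown here, and `BoundedPoolSketch` and `distinctThreadsUsed` are hypothetical names used only for this sketch.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: many requests served by a fixed number of reused threads. */
final class BoundedPoolSketch {

    /** Runs `requests` tasks on a pool of `poolSize` threads and returns how many distinct threads ran them. */
    static int distinctThreadsUsed(int poolSize, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(requests);

        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                threadNames.add(Thread.currentThread().getName());
                done.countDown();
            });
        }

        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow(); // stop the pool's worker threads once we are finished
        }
        return threadNames.size();
    }

    public static void main(String[] args) {
        // 100 requests never use more than the 4 pooled threads.
        System.out.println(distinctThreadsUsed(4, 100) <= 4); // true
    }
}
```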
-
Check for Java unused dependencies as part of `build` CI job.
Following up to #5478, we've decided to check for unused dependencies in the `build` CI job. -
Migrate utils to graknlabs/common.
We migrate some generally useful `utils` to the `graknlabs/common` repository from the `graknlabs/grakn/common` package. Correspondingly, if dependencies are correct, the package paths should change from `grakn.core.common.util.Pair` to `grakn.common.util.Pair`, for example.
Also adds the requirement to have a `tags` field on `java_library` dependencies of `server`, with the missing ones filled in following our naming pattern.
Closes #5454 -
Remove Autovalues from Grakn Core.
We decided that AutoValues add an extra layer of indirection and make code more difficult to read and use, with small added benefits of automatic `equals()` and immutable `value` classes. This change removes them from our code base everywhere. -
Refactor how version is provided to deployment rules.
Fix vaticle/bazel-distribution#150 -
Rearchitect Grakn Packages.
Grakn server has a tightly coupled and very circular dependency structure, which is why the main BUILD file contains both `server` and `graql`.
The primary goal is to split `server` and `graql` into separate bazel packages and introduce a better story and flow of the code. We imagine that:
`server` -- the implementation of the Grakn server -- should receive RPC calls, instantiate sessions and transactions, required caches, etc. Queries are handed (via transactions) to `graql`.
`graql` -- the implementation of the Graql language -- can invoke its subpackages - `reasoner`, `executor`, which in turn may hit `planning`.
Finally, we receive `Concept`s and `answers` that are returned again via the `server`.
Getting to this flow involves introducing a set of interfaces that reflect Grakn's overall architecture and capabilities. This in turn lets us break circular dependencies between packages. This should discourage future circular dependencies, and encourage significantly better encapsulation by preventing us from over-exposing methods from implementations of classes to each other, which was (and still is) prevalent. -
Cleanup Maven deployment targets and jobs.
vaticle/bazel-distribution#191 changed how the version should be provided for Maven deployment. This PR adapts Grakn Core to the latest changes. -
Client Java Cyclic Dependency.
Client-java no longer has code dependencies on Grakn Core. However, we still have interfaces in `grakn.core.api` that are no longer being used externally.
These are collapsed into `Keyspace` (previously `KeyspaceImpl`), `Session` (previously `SessionImpl`), and `TransactionOLTP` classes and the `grakn.core.api` package is deleted.
We also synchronize all tests that depend on client-java's new API which are no longer identical to those in `grakn.core` -- notably `grakn.client.concept.api` and the `Transaction` class on the client.
This change does NOT do any very large package refactors beyond those required. -
Disable janus consistency checks.
Janus does optional consistency checking to avoid duplicate edges and duplicate properties on vertices.
When a check encounters an already existing equivalent, it throws.
Those checks are redundant for us as Grakn already ensures uniqueness in these cases. -
Migrate test-assembly-windows-zip to CircleCI Windows.
Fix #5427; Fix #5408
Use native CircleCI Windows executor to assemble Grakn for Windows and run an application test -
Implement Custom Lazy Stream Merging.
Replace io.vavr (#5430), which was introduced to work around the lack of a lazy `flatMap` mentioned in #5430. We then determined that the `io.vavr` Stream implementation is lazier, but the conversion step itself makes a single call to `.next()`. In practical terms, this means Grakn always precomputed the first answer on query, which is still not fully lazy.
This led to the decision to restore prior behavior and solve the eagerness problem by implementing our own lazy stream merging solution.
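A minimal sketch of lazy merging, assuming answers arrive as iterators that are expensive to open. It is not Grakn's implementation, and `LazyMergeSketch` and `lazyMerge` are hypothetical names. The point it shows is that no source is opened until the caller asks for an element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch: concatenates iterators that are only created on demand. */
final class LazyMergeSketch {

    static <T> Iterator<T> lazyMerge(Iterator<Supplier<Iterator<T>>> sources) {
        return new Iterator<T>() {
            private Iterator<T> current = Collections.emptyIterator();

            @Override public boolean hasNext() {
                // Open the next source only when the current one is exhausted.
                while (!current.hasNext() && sources.hasNext()) {
                    current = sources.next().get();
                }
                return current.hasNext();
            }

            @Override public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger opened = new AtomicInteger();
        List<Supplier<Iterator<Integer>>> sources = new ArrayList<>();
        sources.add(() -> { opened.incrementAndGet(); return Arrays.asList(1, 2).iterator(); });
        sources.add(() -> { opened.incrementAndGet(); return Arrays.asList(3).iterator(); });

        Iterator<Integer> merged = lazyMerge(sources.iterator());
        System.out.println(opened.get()); // 0: nothing computed yet
        System.out.println(merged.next()); // 1
        System.out.println(opened.get()); // 1: only the first source was opened
    }
}
```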
Other Improvements
-
Use port 587 instead of the default 25 for sending release-notification.
We now use the more appropriate port `587` for sending SMTP email. The default one (port `25`) is throttled by default (reference article) and therefore should not be used. -
Adds the checkout step in release-notification.
`release-notification` needs to do the checkout step in order to access the VERSION file.