Before using KG, complete the configuration to choose the features and define the nodes and edges (relationships). Click the “KG Setting” button in the upper-right corner of the KG main page to open the KG setup page for KG initialization. Please note that the setup option is visible only to admin users.
The initialization interface is shown in the image below. Next, we go through these steps one by one.
2.1 Setup Key Fields
The first step of KG initialization is the Setup Key Fields page as illustrated below. We need to configure three fields: the entity ID (user ID), the display name (a description of the user ID), and the display icon.
The Entity ID is the attribute that represents the user node in the graph. Usually it is the user ID. If the data has no user ID field, you can use another main entity instead, such as application ID or email ID.
The display name is the text shown at the bottom of the user node (0007...9b4e in the figure below). It can use the same attribute as the user ID or a different one, e.g., username or full name.
The user node icon configuration is an optional step. If it is not configured, each user node has a default human icon, as shown in the figure above. When configured, it adds a character in the center of the user node, as shown in the figure below. For example, suppose a user type attribute user_type records whether the user is a customer, employee, or vendor. If we select user_type in the “Choose Category” field, each user node displays the first letter of its attribute value in the center, so the type of each user node is easy to recognize. In this example, the graph displays E for a user whose user_type is employee and V for a vendor node.
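The icon rule described above can be sketched in a few lines of Python. This is only an illustration of the behavior, not KG's actual implementation: the function name icon_letter is hypothetical, and uppercasing the letter is an assumption based on the E and V shown in the example.

```python
def icon_letter(category_value):
    """Return the character to show in the center of a user node.

    Returns None when no category value is available, which stands for
    the default human icon. Uppercasing is an assumption taken from the
    E / V example in the text.
    """
    if category_value is None:
        return None
    text = str(category_value).strip()
    if not text:
        return None
    return text[0].upper()


# Example with the user_type values mentioned above.
for value in ("customer", "employee", "vendor"):
    print(value, "->", icon_letter(value))
```

Note that two category values starting with the same letter would show the same character, so it may help to choose a category attribute whose values have distinct initials.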
2.2 Categorize Fields
After selecting the key fields, click Next to enter the Categorize Fields page as illustrated below. Section by section, label each field in the existing field list with an icon. The user can choose the display icon for each field node.
If a value is not manually selected for a setting above, the default value is used. After completing this step of the settings, click Back or Next.
2.3 Add Node and Relation
After completing the field categorization settings, click Next to enter the Add Node and Relation page as illustrated below. Add each relationship field to the matching list: Bind Relation, Transaction Relation, or Share Value Relation. Relationship fields that are not needed can be removed. After completing this step, click Next.
2.3.1 Bind relationship
The bind relationship denotes a strong relationship between entities. When investigating one user, all other users sharing the bind relationship should be automatically displayed and investigated. Typical bind relationship features include
- IDs such as SSN, driver's license, passport ID
- Email address
- Home address
- Phone number
- Device ID
- Account numbers such as card number, bank account number
- IP address
2.3.2 Transaction relationship
Transaction relationships represent transactions between entities. They can be configured in two ways, depending on the input data.
The first way is a user node to an account number. Use it when the data records the account number on the other side of the transaction and the account number is not a user ID. The KG graph will look like the following.
To configure such a relationship, select either the “Pay” or the “Receive” option. Then, for both “Transaction entity attribute” and “Entity Type”, select the account entity attribute. For the amount feature, select the amount attribute.
Please note that the amount must be an integer or a double and cannot include special characters such as $ or ‘,’. If your amount attribute contains such non-digit characters, use the feature platform to remove them and create a new feature. Then come back to KG and configure the amount using this new feature.
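The cleanup that the new feature needs to perform can be sketched as follows. This is a minimal Python illustration of the transformation, not the feature platform's own syntax; the function name clean_amount is hypothetical, and it assumes that a comma is a thousands separator and a period is the decimal point.

```python
import re

# Everything except digits, the decimal point, and a minus sign is dropped.
_NON_NUMERIC = re.compile(r"[^0-9.\-]")


def clean_amount(raw):
    """Convert a raw amount such as '$1,234.50' into a float.

    Returns None when nothing numeric can be recovered, so that bad
    values can be reviewed instead of being loaded silently.
    """
    if raw is None:
        return None
    if isinstance(raw, (int, float)):
        return float(raw)
    stripped = _NON_NUMERIC.sub("", str(raw))
    try:
        return float(stripped)
    except ValueError:
        return None


print(clean_amount("$1,234.50"))
```

If the data uses a comma as the decimal separator, this rule would produce wrong values and would have to be adapted before the new feature is created.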
The second way is a user node to another user node. Use it when the data has transaction relationships between user IDs. The KG graph will look like the following.
To configure such a transaction relationship, select the attribute that represents the transaction destination and specify that the entity type is the same as the main user entity.
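To make the two configurations concrete, the sketch below builds one graph edge from one transaction row in each style. It is an illustration only: the function transaction_edge, the field names user_id, to_account, to_user and amount, and the reading of “Pay” as user to destination and “Receive” as destination to user are all assumptions, not KG's internal model.

```python
def transaction_edge(row, dest_attr, dest_is_user, direction="Pay"):
    """Build one directed edge (source, target, amount) from a transaction row.

    dest_is_user=False models the user-to-account configuration;
    dest_is_user=True models the user-to-user configuration, where the
    destination has the same entity type as the main user entity.
    """
    user = ("user", row["user_id"])
    dest_type = "user" if dest_is_user else "account"
    dest = (dest_type, row[dest_attr])
    if direction == "Pay":
        return (user, dest, row["amount"])
    return (dest, user, row["amount"])


# Hypothetical rows, one per configuration style.
to_account = {"user_id": "u1", "to_account": "A9", "amount": 50.0}
to_user = {"user_id": "u1", "to_user": "u2", "amount": 75.0}

print(transaction_edge(to_account, "to_account", dest_is_user=False))
print(transaction_edge(to_user, "to_user", dest_is_user=True))
```

In the first case the destination becomes a separate account node; in the second it is another user node, so the transaction directly connects two users on the graph.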
2.3.3 Share value relationship
The share value relationship is also a user-to-entity relationship, similar to the bind relationship. The only difference is in how the graph is expanded. A bind relationship expands the graph using the entity. For example, if the user node under analysis has driver's license number 12345 and the license number is configured as a bind relationship, then all other users sharing the same driver's license number are automatically pulled and plotted on the graph. A share value relationship is weaker. The system does not pull additional nodes that share the value. Instead, KG works only with the current graph, linking together all nodes that have the same property value.
Typical attributes we can configure as share value relationships include
- Account open dates
These attributes can bring unrelated users together, so we do not configure them as bind relationships. However, for a group of users already connected by other bind relationships, sharing these attributes shows that they are additionally correlated, so we would like to show these attributes on the graph.
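The difference between the two relationship types can be sketched with a toy example. This is a simplified Python illustration under stated assumptions, not how KG is implemented: the USERS records, the attribute names license and open_date, and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: user ID -> attributes.
USERS = {
    "u1": {"license": "12345", "open_date": "2021-03-01"},
    "u2": {"license": "12345", "open_date": "2021-03-01"},
    "u3": {"license": "99999", "open_date": "2021-03-01"},
}


def expand_bind(seed, users, bind_attr):
    """Bind relationship: pull in every user sharing the seed's value."""
    value = users[seed].get(bind_attr)
    if value is None:
        return {seed}
    return {uid for uid, attrs in users.items() if attrs.get(bind_attr) == value}


def link_share_value(visible, users, share_attr):
    """Share value relationship: link nodes already on the graph only."""
    groups = defaultdict(set)
    for uid in visible:
        value = users[uid].get(share_attr)
        if value is not None:
            groups[value].add(uid)
    # Keep only values shared by at least two visible nodes.
    return {value: members for value, members in groups.items() if len(members) > 1}


visible = expand_bind("u1", USERS, "license")
links = link_share_value(visible, USERS, "open_date")
print(sorted(visible))
print({value: sorted(members) for value, members in links.items()})
```

Here u2 is pulled onto the graph because it shares the license number with u1. User u3 has the same account open date but is never pulled in, because the open date is configured only as a share value relationship.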
2.4 Process Your Data
After completing the configuration shown above, click Next to go to the fourth step of KG initialization. The Process Your Data page is shown below. Select the dataset path and the corresponding time field type and format, and name the processing task. Then click the START PROCESS button, and the system begins batch processing the imported data. After this step, KG initialization is complete.
While the data is loading, the job status is displayed in the task center.