Chapter 6 - Appendix

DataVisor Packages: Generic, Transaction Fraud, Application Fraud, Promo Abuse, Anti-Money Laundering, Global Intelligence Network. Depending on your purchase options, one or more of these packages will be visible.

DataVisor Operator Function List:

Operator name

Description

AND

Returns True if both condition one and condition two are satisfied; otherwise returns False.

AVERAGE

Calculate average value of a feature.

The sourceFeatureName should be tagged as time series in event config.

Argument values: the list of values to calculate the average from. Argument isToRoundValue: a Boolean argument.

AVERAGE_GREATER_THAN_OR_EQUAL_TO

Compare the average of a list of numeric values against a base value and return True or False.
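The two averaging operators above can be sketched in Python as follows. This is only an illustration: the function names, the meaning given to isToRoundValue (round to the nearest integer), and the handling of an empty list are assumptions, not DataVisor's implementation.

```python
from statistics import mean


def average(values, is_to_round_value=False):
    """Average a list of numbers; optionally round the result (assumed meaning)."""
    if not values:
        return None  # assumption: an empty list yields no result
    result = mean(values)
    return round(result) if is_to_round_value else result


def average_greater_than_or_equal_to(values, base_value):
    """Return True when the average of values is at least base_value."""
    avg = average(values)
    return avg is not None and avg >= base_value
```

For example, `average_greater_than_or_equal_to([10, 20, 30], 20)` evaluates to True under these assumptions, because the average is exactly 20.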

BOT_PATTERN_EVENTS

Check the intervals between a series of events (all event types) and detect whether those events were created by a bot. Returns True or False.

BUCKETIZE_FOR_E_ATTRIBUTE

Generate an integer value that bucketizes a non-negative input value according to desired type of bucketization and desired sizes of the buckets.

BUCKETIZE_FOR_U_ATTRIBUTE

Generate a set of integer values that bucketize a set of non-negative input values according to desired process type of the input set, the desired type of bucketization and desired sizes of the buckets.

CAP_COUNT

If the value of sourceFeatureName is greater than capCountThreshold, return capCountThreshold; otherwise return the value itself.
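The capping rule is equivalent to taking the minimum of the value and the threshold. A minimal Python sketch, with a hypothetical function name and plain numeric arguments standing in for the feature value:

```python
def cap_count(value, cap_count_threshold):
    """Return value, limited to at most cap_count_threshold."""
    return min(value, cap_count_threshold)
```

With a threshold of 10, an input of 12 is capped to 10, while an input of 3 is returned unchanged.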

CHECK_CHINESE_STRINGS_MISMATCH_IN_TWO_FIELD

Check whether the applicant has mismatched Chinese addresses or company names in two fields (normally one from the application form and the other from the credit report). This is a "fuzzy" check: it handles the case where the two values do not match exactly but actually represent the same thing.

CHECK_CREDIT_USAGE_PERCENTAGE

Used in credit card application scenarios to check whether the user's credit usage percentage is high. It returns True or False; True indicates that the usage percentage is high.

CHECK_MULTI_DISTINCT_CHINESE_STRINGS_IN_ONE_FIELD

Check whether the applicant fills in multiple different Chinese addresses or company names in different applications. This is a "fuzzy" check: it handles the case where the values do not match exactly but actually represent the same thing.

CHINESE_ADDRESS_MISMATCH

Check whether two Chinese address fields are mismatched. This is a "fuzzy" check: it handles the case where the two values do not match exactly but actually represent the same address. The higher the threshold, the higher the chance that they are mismatched.

CHINESE_COMPANY_MISMATCH

Check whether two Chinese company fields are mismatched. This is a "fuzzy" check: it handles the case where the two values do not match exactly but actually represent the same company. The higher the threshold, the higher the chance that they are mismatched.

CHINESE_MOBILE_FROM_VIRTUAL_CARRIER

Generate a boolean value that describes whether a Chinese mobile number is from a virtual mobile carrier.

CHINESE_MOBILE_PREFIX_LOCATION

Find the location where the given mobile number was first registered.

The available location scales are: city, province, city_and_province (separated by ", ")

CHINESE_NAME_RARENESS_AVG_SCORE

Compute the average score of Chinese names. The higher the score, the rarer the name.

Argument chineseName: Chinese name (expected format, e.g. 张三)

Argument capSize: (Default value: -1.0) Cap size of the output value. The output is not capped if capSize is -1.0.

Returns the average score indicating how rare the Chinese name is.

CITY_JUMPER

Check if the user has changed IP too often. For this operator, DataVisor recommends using the default values set for cityJumperEventsLowerLimit, RelaxedLimit and EnormourEventLimit.

The output of this operator is True or False.

CLEAN_CHINESE_ADDRESS

Return the cleaned address which only retains the core information of the address.

CLEAN_INVALID_CHINESE_PHONE_NUMBER

Delete invalid telephone numbers; especially useful in Chinese financial scenarios.

CLEAN_INVALID_EMAIL

Delete invalid email prefixes; especially useful in Chinese financial scenarios. In particular, this returns an empty string ("") for an email prefix that has a length of less than 4 characters and is formed of fewer than 3 different characters (such as "baaa"), or that contains 4 consecutive numbers, even if other characters appear in between (such as "1a2c3b4"). Otherwise, it returns the original string.

COLLECT

A velocity operator which aggregates all values and saves them into a list.

CONCATENATE

Concatenate 2 or more strings into one

COUNT

A velocity operator which counts all values and returns the result as a number

DELETE_PREFIX

Check whether an input string starts with a prefix from the list of invalid prefixes and, if so, delete that prefix.

DELETE_SUFFIX

Check whether an input string ends with a suffix from the list of invalid suffixes and, if so, delete that suffix.
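A minimal Python sketch of the DELETE_PREFIX and DELETE_SUFFIX behaviour follows. The function names are hypothetical, and removing only the first matching entry, once, is an assumption; the actual operators may differ.

```python
def delete_prefix(text, invalid_prefixes):
    """Strip the first matching invalid prefix from text, if any (assumed: once only)."""
    for prefix in invalid_prefixes:
        if prefix and text.startswith(prefix):
            return text[len(prefix):]
    return text


def delete_suffix(text, invalid_suffixes):
    """Strip the first matching invalid suffix from text, if any (assumed: once only)."""
    for suffix in invalid_suffixes:
        if suffix and text.endswith(suffix):
            return text[:-len(suffix)]
    return text
```

For instance, `delete_prefix("test_alice", ["test_", "tmp_"])` gives `"alice"`, and a string with no listed prefix is returned unchanged.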

DIGIT_DIS_TO_TAIL

Look for the last numeric digit in the username string and calculate its distance to the end of the string.
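One way to read this distance is the number of characters that follow the last digit. The Python sketch below uses that reading; the function name, the zero-based distance, and the -1 result for usernames without digits are all assumptions.

```python
def digit_dis_to_tail(username):
    """Count the characters after the last ASCII digit; -1 if there is no digit (assumed)."""
    for offset, char in enumerate(reversed(username)):
        if "0" <= char <= "9":
            return offset
    return -1
```

Under these assumptions, `"user123"` yields 0 (the last digit is the final character) and `"user12ab"` yields 2.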

DISTINCT_COUNT

A velocity operator which counts all distinct values and returns the result as a number
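The three velocity operators in this list (COLLECT, COUNT and DISTINCT_COUNT) differ only in how they aggregate the same series of values. The Python sketch below illustrates that difference on a plain list; the function names are hypothetical, and selecting the events that fall inside the velocity time window is assumed to have happened already.

```python
def collect(values):
    """Aggregate all values into a list, keeping duplicates and order."""
    return list(values)


def count(values):
    """Count all values, duplicates included."""
    return len(list(values))


def distinct_count(values):
    """Count the distinct values only."""
    return len(set(values))
```

For the series `["1.1.1.1", "2.2.2.2", "1.1.1.1"]`, the count is 3 and the distinct count is 2.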

DIVIDE

Perform true division on numeric values.

EATTR_CONCATENATE

Unlike the "CONCATENATE" operator, this operator adds "_" between the strings that it concatenates. In addition, if one of the strings is empty, it returns an empty string instead.
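The difference between CONCATENATE and EATTR_CONCATENATE can be sketched in Python as follows. The function names are hypothetical and the sketch only mirrors the behaviour described in this list.

```python
def concatenate(*parts):
    """Join two or more strings with no separator."""
    return "".join(parts)


def eattr_concatenate(*parts):
    """Join strings with "_"; return "" if any part is empty."""
    if any(part == "" for part in parts):
        return ""
    return "_".join(parts)
```

Here `concatenate("a", "b")` gives `"ab"`, `eattr_concatenate("a", "b")` gives `"a_b"`, and `eattr_concatenate("a", "")` gives `""`.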

EMAIL_IS_ALEXA_TOP_DOMAIN

Check if the domain is common by checking against Alexa database

EMAIL_IS_BAD_PROVIDER

Check if the top email domain is rare, based on DataVisor database

EMAIL_IS_UNIVERSITY_DOMAIN

Check if the email domain is from a university

EMAIL_IS_VALID_FORMAT

Check if the emails provided are in a valid format

EMAIL_IS_VALID_TOP_DOMAIN

Check if the top email domain is common, based on DataVisor database

EMAIL_PREFIX

Extract the string before @ in email addresses

EMAIL_PREFIX_CONTAINS

Extract and return the substring that contains plus sign (+) in the email's prefix.

EMAIL_PREFIX_IS_NORMAL_VOWEL_RATIO

Compute the rounded ratio of vowel characters to the total number of characters in the email prefix.

If the ratio is greater than 0.25 and less than 0.6, the operator returns True.
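The vowel-ratio check can be sketched in Python as below. The function names are hypothetical; treating only a, e, i, o, u as vowels, ignoring case, skipping the rounding step, and returning False for an empty prefix are assumptions.

```python
VOWELS = set("aeiou")  # assumption: ASCII vowels only


def email_prefix(email):
    """Return the part of the address before the first @."""
    return email.split("@", 1)[0]


def is_normal_vowel_ratio(email):
    """True when the vowel share of the prefix is strictly between 0.25 and 0.6."""
    prefix = email_prefix(email).lower()
    if not prefix:
        return False  # assumption: an empty prefix is not "normal"
    ratio = sum(ch in VOWELS for ch in prefix) / len(prefix)
    return 0.25 < ratio < 0.6
```

A prefix such as "robert" has 2 vowels out of 6 characters (about 0.33) and passes, while a vowel-free prefix such as "xkqzvbn" does not.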

EMAIL_PREFIX_LENGTH

Measure the length of prefix in an email (string before @)

EMAIL_PREFIX_NUM_DIGIT

Count the number of digits in prefix of an email (string before @)

EMAIL_PREFIX_NUM_SPECIAL_CHAR

Count the number of special characters in prefix of an email (string before @)

Special characters that are allowed in email addresses: !#$%&'*+-/=?^_`{|}~ ;

EMAIL_PREFIX_SPECIAL_CHAR

Extract the special characters in the prefix of an email (string before @) and return them as a string.

Special characters that are allowed in email addresses: !#$%&'*+-/=?^_`{|}~ ;
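The prefix-based operators above (EMAIL_PREFIX, EMAIL_PREFIX_LENGTH, EMAIL_PREFIX_NUM_DIGIT, EMAIL_PREFIX_SPECIAL_CHAR and EMAIL_PREFIX_NUM_SPECIAL_CHAR) can be illustrated together. In this Python sketch the function names are hypothetical, and the trailing ";" shown in the special-character list is treated as punctuation and left out of the set, which is an assumption.

```python
SPECIAL_CHARS = set("!#$%&'*+-/=?^_`{|}~")  # assumption: ";" is not part of the set


def email_prefix(email):
    """Return the part of the address before the first @."""
    return email.split("@", 1)[0]


def email_prefix_length(email):
    return len(email_prefix(email))


def email_prefix_num_digit(email):
    """Count ASCII digits in the prefix."""
    return sum("0" <= ch <= "9" for ch in email_prefix(email))


def email_prefix_special_char(email):
    """Return the special characters of the prefix, in order, as one string."""
    return "".join(ch for ch in email_prefix(email) if ch in SPECIAL_CHARS)


def email_prefix_num_special_char(email):
    return len(email_prefix_special_char(email))
```

For the address `jo_hn+99@example.com`, the prefix is `jo_hn+99`, its length is 8, it contains 2 digits, and its special characters are `_+`.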

EMAIL_PROVIDER

Derive the provider of email addresses

EMAIL_SIMPLIFIED_PREFIX

Remove all digits and the characters "+", "-", "_" and "." from the email prefix.
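The simplification rule can be sketched with a single regular-expression substitution. The function name is hypothetical, and applying the rule to the part of the address before @ is an assumption based on the operator name.

```python
import re


def email_simplified_prefix(email):
    """Drop digits and the characters + - _ . from the email prefix."""
    prefix = email.split("@", 1)[0]
    return re.sub(r"[0-9+\-_.]", "", prefix)
```

For example, `john.doe_99+promo@example.com` is simplified to `johndoepromo`.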

EVENT_CAP_COUNT_IN_DAILY_HOUR_RANGE

This operator is used to capture a user's consecutive event counts based on the given event types.

Please note: two events on different days with no unqualified events between them are considered "consecutive".

EXTRACT_DEVICE_FAMILY_FROM_USER_AGENT

Extract the general device family category from a user-agent string and return it.

Argument userAgent: User-agent attribute value.

EXTRACT_OS_FAMILY_FROM_USER_AGENT

Extract the general OS family category from a user-agent string. It returns the general OS family category of the user-agent.

EXTRACT_PHONE_NUMBER

Extract phone numbers within given length range from texts
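A length-bounded extraction like this can be approximated with a regular expression that matches digit runs of a given minimum and maximum length. In the Python sketch below, the function and parameter names are hypothetical, and treating a phone number as an uninterrupted run of digits is an assumption.

```python
import re


def extract_phone_number(text, min_length, max_length):
    """Return digit runs whose length lies within [min_length, max_length]."""
    # The lookarounds keep a longer digit run from being matched in part.
    pattern = rf"(?<!\d)\d{{{min_length},{max_length}}}(?!\d)"
    return re.findall(pattern, text)
```

With a range of 7 to 11, the text `"call 13800138000 or 12345"` yields only `13800138000`, because `12345` is too short.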

EXTRACT_Q_Q_NUMBER_FROM_CHINESE_COMMENT

Extract QQ ID number from Chinese comment. It's commonly used in social scenarios.

Argument sourceString: the event attribute to extract QQ numbers from

It returns a hash set that includes all the QQ numbers.

EXTRACT_REGEX

Extract strings that match the given regular expression from texts.
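As an illustration of regex-based extraction, the Python sketch below returns every non-overlapping match of a pattern in a text. The function name and the list return type are assumptions; the regular-expression dialect used by the operator itself may differ from Python's.

```python
import re


def extract_regex(text, pattern):
    """Return all non-overlapping full matches of pattern in text."""
    return [match.group(0) for match in re.finditer(pattern, text)]
```

For example, applying the pattern `[A-Z]\d+` to `"order A12 and B345"` returns `["A12", "B345"]`.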

EXTRACT_SPECIAL_CHAR

Return special characters in a username as string

EXTRACT_SUBSTRING

Return the substring between two index values.

EXTRACT_USER_AGENT_PROD_NAME

Extract the product identifier from a user-agent string. For the detailed syntax of the user-agent header, see: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent

Argument userAgent: User-agent attribute value.

It returns the product identifier of the user-agent.

EXTRACT_VALUE_FROM_URL

Extract the value of the field named by fieldName from a URL or click_referer.
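Reading a named field from a URL's query string can be sketched with Python's standard urllib.parse module. The function name, returning only the first value, and returning an empty string when the field is missing are assumptions.

```python
from urllib.parse import parse_qs, urlsplit


def extract_value_from_url(url, field_name):
    """Return the first value of field_name in the URL query string, or ""."""
    query = urlsplit(url).query
    values = parse_qs(query).get(field_name)
    return values[0] if values else ""
```

For the URL `https://example.com/signup?utm_source=ad&ref=42`, the field `ref` yields `"42"`.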

EXTRACT_WECHAT_FROM_CHINESE_COMMENT

Extract WeChat numbers from a Chinese comment. It's commonly used in social scenarios.

Argument sourceString: the event attribute to extract WeChat numbers from

It returns a hash set that includes all the WeChat numbers.

GET_AGE_FROM_CHINESE_NATIONAL_ID_NUMBER

Calculate the age of the user from the user's Chinese national ID prefix.

Argument certNo: Chinese national ID number. It returns the age of the user.
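In an 18-character Chinese national ID number, characters 7 to 14 encode the date of birth as YYYYMMDD, so the age can be derived from the leading part of the number. The Python sketch below shows the idea; the function name, the optional today parameter, and returning None for malformed input are assumptions, and the check character is not validated.

```python
from datetime import date


def get_age_from_chinese_national_id_number(cert_no, today=None):
    """Derive the age in full years from the birth date embedded in cert_no."""
    if len(cert_no) != 18 or not cert_no[6:14].isdigit():
        return None  # assumption: malformed input yields no age
    today = today or date.today()
    try:
        birth = date(int(cert_no[6:10]), int(cert_no[10:12]), int(cert_no[12:14]))
    except ValueError:
        return None  # not a real calendar date
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    return today.year - birth.year - int(before_birthday)
```

Passing a fixed `today` makes the result reproducible: for a made-up number whose birth date portion is 19900307, the age is 33 on 2024-03-06 and 34 on 2024-03-07.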

GET_CITY_BY_GPS

Get a location value from GPS coordinates. The value returned is one of:

1: city,

2: the pinyin of the city,

3: province,

4: the pinyin of the province.

Argument latitude

Argument longitude

Argument valueLoc: which value to return [1-4]

 
