Keyword Search API - Document Level

The document-level Keyword Search API searches document content according to the search options you supply.

POST https://backend.alli.ai/webapi/keyword_search/documents

Getting the API KEY

Please provide your API key in the request header API-KEY. Your API key can be found in your dashboard Settings menu, under the General tab.

Request

Provide your API KEY in the request header API-KEY.

| Name | Type | Description |
|---|---|---|
| API-KEY* | string | Your Cognitive Search API key. It can be found in your dashboard Settings menu, under the General tab. |
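As a minimal sketch of the header requirement above, the snippet below prepares (but does not send) a POST request with the API-KEY header using Python's standard library. The helper name build_request is hypothetical, and the payload is borrowed from the request example on this page.

```python
import json
import urllib.request

ENDPOINT = "https://backend.alli.ai/webapi/keyword_search/documents"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare a POST request that carries the API key in the API-KEY header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json", "API-KEY": api_key},
        method="POST",
    )


req = build_request(
    "YOUR_API_KEY",
    {"topN": 2, "node": {"nodeType": "wildcard", "input": "hello*"}},
)
# urllib stores header names with normalized capitalization ("Api-key");
# HTTP header names are case-insensitive, so the server sees the same header.
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```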

Request Body

| Name | Type | Description |
|---|---|---|
| topN | int | Number of results to return (default: 10). |
| node* | SearchNode | Search data model that describes the keyword search condition. |
| hashtags | HashtagNode | Search data model for hashtag conditions that documents must match. |
| excludingHashtags | HashtagNode | Search data model for hashtag conditions that exclude documents. |
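The request body fields above can be assembled with a small helper. This is only a sketch: build_body is a hypothetical name, and it assumes that optional fields may simply be omitted when unused, which the page implies but does not state.

```python
from typing import Optional


def build_body(
    node: dict,
    top_n: int = 10,
    hashtags: Optional[dict] = None,
    excluding_hashtags: Optional[dict] = None,
) -> dict:
    """Assemble a request body; 'node' is required, the rest are optional."""
    if not node:
        raise ValueError("'node' is required")
    body = {"topN": top_n, "node": node}
    # Optional HashtagNode fields are added only when provided (assumption).
    if hashtags is not None:
        body["hashtags"] = hashtags
    if excluding_hashtags is not None:
        body["excludingHashtags"] = excluding_hashtags
    return body


wildcard = {"nodeType": "wildcard", "input": "hello*"}
body = build_body(wildcard, excluding_hashtags={"hashtags": ["test1"]})
```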

Search Node

| Name | Description |
|---|---|
| WildCardNode | Matches documents that contain the input keyword, with wildcard support. |
| ExactlyPhraseNode | Matches documents that contain the input phrase exactly. |
| WithinXWordsNode | Matches documents in which the input1 keyword and the input2 keyword are separated by no more than a maximum distance. |
| ExcludeKeywordNode | Matches documents that do not contain the input keyword. Combine this SearchNode with other SearchNodes through an AndNode or OrNode. |
| AndNode | Matches documents returned by all of its child nodes. Requirements: 1. nodes must contain at least two nodes. 2. Use only at the top level. |
| OrNode | Matches documents returned by any of its child nodes. Requirement: 1. nodes must contain at least two nodes. |
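The AndNode and OrNode requirements above can be checked on the client before sending a request. The sketch below assumes the "nodeType" and "nodes" field names and the "AND" / "OR" values used in the examples on this page; validate_node is a hypothetical helper, not part of the API.

```python
def validate_node(node: dict, is_root: bool = True) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    # The examples on this page use both lower- and upper-case nodeType values,
    # so the comparison here is case-insensitive.
    node_type = str(node.get("nodeType", "")).upper()
    if node_type in ("AND", "OR"):
        children = node.get("nodes", [])
        if len(children) < 2:
            problems.append(f"{node_type} needs at least two child nodes")
        if node_type == "AND" and not is_root:
            problems.append("AND is only allowed at the top level")
        for child in children:
            problems.extend(validate_node(child, is_root=False))
    return problems


hello = {"nodeType": "EXACTLY_PHRASE", "input": "hello"}
world = {"nodeType": "WILDCARD", "input": "world*"}
not_good = {"nodeType": "EXCLUDE_KEYWORD", "input": "good"}

ok_tree = {"nodeType": "AND", "nodes": [hello, world, not_good]}
nested_and = {
    "nodeType": "OR",
    "nodes": [hello, {"nodeType": "AND", "nodes": [world, not_good]}],
}
```

Here validate_node(ok_tree) reports no problems, while validate_node(nested_and) flags the AND node that sits below the root.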

Hashtag Node

| Name | Type | Description |
|---|---|---|
| hashtags | array[string] | Hashtags to filter the list. Only documents with these hashtags are searched. You can add multiple hashtags, separated by commas. |
| combinedHashtags | array[array[string]] | Documents you uploaded to the dashboard have hashtags. This field sets the search scope using groups of hashtags. Note: if "combinedHashtags" is used, you must keep the "hashtags" parameter empty. |
| hashtagsOperator | string | Either "and" or "or". With "and", the 'hashtags' filter combines multiple hashtags with AND logic; with "or", it uses OR logic. |
| combinedHashtagsOperator | string | Either "and" or "or". Choose "and" if you want both of the combined hashtags in the results. Choose "or" if you want at least one of the combined hashtags in the results. |
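The HashtagNode constraints above (operators limited to "and" / "or", and "hashtags" left empty when "combinedHashtags" is used) can be enforced when building the node. This is a sketch with a hypothetical helper name, build_hashtag_node; the field names come from the table above.

```python
from typing import List, Optional

VALID_OPERATORS = ("and", "or")


def build_hashtag_node(
    hashtags: Optional[List[str]] = None,
    combined_hashtags: Optional[List[List[str]]] = None,
    hashtags_operator: Optional[str] = None,
    combined_hashtags_operator: Optional[str] = None,
) -> dict:
    """Build a HashtagNode dict, rejecting combinations the docs disallow."""
    if hashtags and combined_hashtags:
        raise ValueError("keep 'hashtags' empty when 'combinedHashtags' is used")
    operators = (
        ("hashtagsOperator", hashtags_operator),
        ("combinedHashtagsOperator", combined_hashtags_operator),
    )
    for name, value in operators:
        if value is not None and value not in VALID_OPERATORS:
            raise ValueError(f"{name} must be 'and' or 'or'")
    node = {}
    if combined_hashtags:
        node["combinedHashtags"] = combined_hashtags
    elif hashtags:
        node["hashtags"] = hashtags
    for name, value in operators:
        if value is not None:
            node[name] = value
    return node


either_tag = build_hashtag_node(hashtags=["test1", "test2"], hashtags_operator="or")
```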

Example

Request Example

Replace YOUR_API_KEY in the example below with your own key. See the Getting the API KEY section.

curl https://backend.alli.ai/webapi/keyword_search/documents \
-d '{"topN": 2, "node": { "nodeType": "wildcard", "input": "hello*" }}' \
-H "Content-Type: application/json" \
-H "API-KEY: YOUR_API_KEY"

Response Example

{
  "projectId": "PROJECT_ID",
  "results": [
    {
      "knowledgeBaseId": "KNOWLEDGE_BASE_ID",
      "title": "Allganize Document"
    },
    {
      "knowledgeBaseId": "KNOWLEDGE_BASE_ID",
      "title": "What is the AI"
    }
  ]
}
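A response shaped like the example above can be unpacked as follows. This sketch relies only on the fields shown in that example ("projectId", "results", "knowledgeBaseId", "title"); extract_results is a hypothetical helper, and the real response may include fields not shown here.

```python
import json

SAMPLE_RESPONSE = """
{
  "projectId": "PROJECT_ID",
  "results": [
    {"knowledgeBaseId": "KNOWLEDGE_BASE_ID", "title": "Allganize Document"},
    {"knowledgeBaseId": "KNOWLEDGE_BASE_ID", "title": "What is the AI"}
  ]
}
"""


def extract_results(response_text: str) -> list:
    """Return (knowledgeBaseId, title) pairs from a response body."""
    data = json.loads(response_text)
    # Use .get so a response without "results" yields an empty list.
    return [
        (item.get("knowledgeBaseId"), item.get("title"))
        for item in data.get("results", [])
    ]


pairs = extract_results(SAMPLE_RESPONSE)
titles = [title for _, title in pairs]
```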

Combine Node Example

  • Search for documents that contain hello and world* but do not contain good

{
    "node": {
        "nodeType": "AND",
        "nodes": [
            {
                "nodeType": "EXACTLY_PHRASE",
                "input": "hello"
            },
            {
                "nodeType": "WILDCARD",
                "input": "world*"
            },
            {
                "nodeType": "EXCLUDE_KEYWORD",
                "input": "good"
            }
        ]
    }
}
  • Search for documents that contain hello or world*

{
    "node": {
        "nodeType": "OR",
        "nodes": [
            {
                "nodeType": "EXACTLY_PHRASE",
                "input": "hello"
            },
            {
                "nodeType": "WILDCARD",
                "input": "world*"
            }
        ]
    }
}
  • Search for documents that contain hello or ( contain world* and do not contain good )

This case is not supported, because the "AndNode" would not be the root node.

Hashtag Example

The examples below search for documents that contain 'hello*' under different hashtag conditions.

  • Search for documents that have the hashtag test1

{
    "topN": 10,
    "node": {
        "nodeType": "wildcard",
        "input": "hello*"
    },
    "hashtags": {
        "hashtags": ["test1"]
    }
}
  • Search for documents that have the hashtag test1 or test2

{
    "topN": 10,
    "node": {
        "nodeType": "wildcard",
        "input": "hello*"
    },
    "hashtags": {
        "hashtags": ["test1", "test2"],
        "hashtagsOperator": "or"
    }
}
  • Search for documents that have the hashtags test1 and test2

{
    "topN": 10,
    "node": {
        "nodeType": "wildcard",
        "input": "hello*"
    },
    "hashtags": {
        "hashtags": ["test1", "test2"],
        "hashtagsOperator": "and"
    }
}
  • Search for documents that have the hashtags (( test1 and test2 ) or ( test3 and test4 ))

{
    "topN": 10,
    "node": {
        "nodeType": "wildcard",
        "input": "hello*"
    },
    "hashtags": {
        "combinedHashtags": [["test1", "test2"], ["test3", "test4"]],
        "hashtagsOperator": "or",
        "combinedHashtagsOperator": "and"
    }
}
  • Search for documents that have the hashtags (( test1 or test2 ) and ( test3 or test4 ))

{
    "topN": 10,
    "node": {
        "nodeType": "wildcard",
        "input": "hello*"
    },
    "hashtags": {
        "combinedHashtags": [["test1", "test2"], ["test3", "test4"]],
        "hashtagsOperator": "and",
        "combinedHashtagsOperator": "or"
    }
}
  • Search for documents without the hashtag test1

{
    "topN": 10,
    "node": {
        "nodeType": "wildcard",
        "input": "hello*"
    },
    "excludingHashtags": {
        "hashtags": ["test1"]
    }
}
  • Search for documents without the hashtag test1 or test2

{
    "topN": 10,
    "node": {
        "nodeType": "wildcard",
        "input": "hello*"
    },
    "excludingHashtags": {
        "hashtags": ["test1", "test2"],
        "hashtagsOperator": "or"
    }
}
  • Search for documents without the hashtag test1 and test2

{
    "topN": 10,
    "node": {
        "nodeType": "wildcard",
        "input": "hello*"
    },
    "excludingHashtags": {
        "hashtags": ["test1", "test2"],
        "hashtagsOperator": "and"
    }
}
  • Search for documents that have the hashtags (( test1 or test2 ) and ( test3 or test4 )) and do not have the hashtags test1 and test3

{
    "topN": 10,
    "node": {
        "nodeType": "wildcard",
        "input": "hello*"
    },
    "hashtags": {
        "combinedHashtags": [["test1", "test2"], ["test3", "test4"]],
        "hashtagsOperator": "and",
        "combinedHashtagsOperator": "or"
    },
    "excludingHashtags": {
        "hashtags": ["test1", "test3"],
        "hashtagsOperator": "and"
    }
}

Limitation

Search Request

  • Exclude Search must be used in combination with Include Search.

  • Wildcard Search is supported for single words only.

    • Wildcards attached to phrases, such as "hello world*", are not supported.
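The wildcard limitation above can be caught before a request is sent. The sketch below treats any input containing whitespace as a phrase; that rule and the helper name is_single_word_wildcard are assumptions, since the page does not define how the service tokenizes input.

```python
def is_single_word_wildcard(value: str) -> bool:
    """True for a single-word wildcard input such as 'hello*'.

    Assumption: an input that splits into more than one whitespace-separated
    token counts as a phrase and is therefore unsupported.
    """
    return "*" in value and len(value.split()) == 1


supported = is_single_word_wildcard("hello*")
unsupported = is_single_word_wildcard("hello world*")
```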

Search Result

  • Pagination support: Not available.

  • Exactly Phrase and Within X Words queries are supported for content that spans two consecutive pages.

Search Quality

  • Headers and footers in a document may interfere with search results.

  • Unrecognized control characters in documents may cause search issues.

  • Documents with no word wrap between pages may not be properly searchable.

  • The index is managed in Elasticsearch with commonly recommended analyzers for English text, for example normalization of verb tenses (past tense, present tense). Because of these analyzers, some searches may return imprecise matches; in such cases, search using exactly_phrase instead.
