Starting an Ingestion Job: Amazon S3 Bucket Datasource

This section walks you through the process of initiating a content ingestion job from an Amazon S3 bucket using the Ingestion API. This API is designed for customers who have external content that they wish to import into the eGain AI Knowledge Hub. To do this, you must first package and format your content according to our specified import file structure, which is detailed in the Format Guide. Once your content folder is correctly formatted, you can upload it to your Amazon S3 bucket. From there, this API will consume the file and begin the process of importing it into your knowledge base.

Prerequisites

Before using the Ingestion API, ensure the following requirements are met:

A valid OAuth 2.0 access token with the knowledge.contentmgr.manage scope.
Content available in an AWS S3 bucket.
Valid AWS credentials (access key ID and secret access key) to access the S3 bucket.
Data to be ingested is packaged into a folder where it must follow a certain folder and file structure. For more information on the expected format, see Format Guide .

API Endpoint

To start an ingestion job, make a POST request to the following endpoint:

https://${API_DOMAIN}/knowledge/contentmgr/v4/import/content

Request Body

The request body must be a JSON object that specifies the data source, operation, and an optional schedule time.

Example Payload

Copy

Copied

{
  "dataSource": {
    "type": "AWS S3 bucket",
    "path": "s3://mybucket/myfolder/",
    "region": "us-east-1",
    "credentials": {
      "accessKey": "AKIAIOSFODNN7EXAMPLE",
      "secretKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }
  },
  "operation": "import",
  "scheduleTime": {
    "date": "2024-03-01T10:00:00.000Z"
  }
}

Using cURL to Start the Ingestion Job

You can use the following cURL command to start the ingestion job. Remember to replace the placeholder values with your actual data. code Bash

Copy

Copied

curl --location --request POST 'https://<API_DOMAIN>/knowledge/contentmgr/v4/import/content' \
--header 'Authorization: Bearer <YOUR_ACCESS_TOKEN>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "dataSource": {
        "type": "AWS S3 bucket",
        "path": "s3://your-bucket-name/your-folder/",
        "region": "your-aws-region",
        "credentials": {
            "accessKey": "YOUR_AWS_ACCESS_KEY_ID",
            "secretKey": "YOUR_AWS_SECRET_ACCESS_KEY"
        }
    },
    "operation": "import",
    "scheduleTime": {
        "date": "2024-03-01T10:00:00.000Z"
    }
}'

Placeholders to Replace:

\<API_DOMAIN> : The domain of your API.
\<YOUR_ACCESS_TOKEN> : Your OAuth 2.0 access token.
s3://your-bucket-name/your-folder/ : The path to your S3 bucket and folder.
your-aws-region : The AWS region where your bucket is located.
YOUR_AWS_ACCESS_KEY_ID : Your AWS access key ID.
YOUR_AWS_SECRET_ACCESS_KEY : Your AWS secret access key.

Successful Response

A successful request returns a 202 Accepted status code. The location header in the response contains the URL to check the status of the import job. Example Response Header:

Copy

Copied

location: /knowledge/contentmgr/v4/import/content/7A84B875-6F75-4C7B-B137-0632B62DB0BD