# Starting an Ingestion Job: Amazon S3 Bucket Datasource

This section walks you through initiating a content ingestion job from an Amazon S3 bucket using the Ingestion API. This API is designed for customers who have external content that they wish to import into the eGain AI Knowledge Hub.

To do this, you must first package and format your content according to our specified import file structure, which is detailed in the [Format Guide](/developer-portal/guides/ingestion/data-import-format-guide). Once your content folder is correctly formatted, you can upload it to your Amazon S3 bucket. From there, this API consumes the folder and begins importing it into your knowledge base.

## Prerequisites

Before using the Ingestion API, ensure the following requirements are met:

- A valid OAuth 2.0 access token with the `knowledge.contentmgr.manage` scope.
- Content available in an AWS S3 bucket.
- Valid AWS credentials (access key ID and secret access key) to access the S3 bucket.
- Data to be ingested is packaged into a folder that follows the required folder and file structure. For more information on the expected format, see the [Format Guide](/developer-portal/guides/ingestion/data-import-format-guide).

## API Endpoint

To start an ingestion job, make a `POST` request to the following endpoint:

`https://${API_DOMAIN}/knowledge/contentmgr/v4/import/content`

## Request Body

The request body must be a JSON object that specifies the data source, the operation, and an optional schedule time.

### Example Payload

```json
{
  "dataSource": {
    "type": "AWS S3 bucket",
    "path": "s3://mybucket/myfolder/",
    "region": "us-east-1",
    "credentials": {
      "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
      "secretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }
  },
  "operation": "import",
  "scheduleTime": {
    "date": "2024-03-01T10:00:00.000Z"
  }
}
```

### Using cURL to Start the Ingestion Job

You can use the following cURL command to start the ingestion job.
Remember to replace the placeholder values with your actual data.

```bash
curl --location --request POST 'https://${API_DOMAIN}/knowledge/contentmgr/v4/import/content' \
--header 'Authorization: Bearer ${ACCESS_TOKEN}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "dataSource": {
    "type": "AWS S3 bucket",
    "path": "s3://your-bucket-name/your-folder/",
    "region": "your-aws-region",
    "credentials": {
      "accessKeyId": "YOUR_AWS_ACCESS_KEY_ID",
      "secretAccessKey": "YOUR_AWS_SECRET_ACCESS_KEY"
    }
  },
  "operation": "import",
  "scheduleTime": {
    "date": "2024-03-01T10:00:00.000Z"
  }
}'
```

### Placeholders to Replace

- `${API_DOMAIN}`: The domain of your API.
- `${ACCESS_TOKEN}`: Your OAuth 2.0 access token.
- `s3://your-bucket-name/your-folder/`: The path to your S3 bucket and folder.
- `your-aws-region`: The AWS region where your bucket is located.
- `YOUR_AWS_ACCESS_KEY_ID`: Your AWS access key ID.
- `YOUR_AWS_SECRET_ACCESS_KEY`: Your AWS secret access key.

### Successful Response

A successful request returns a `202 Accepted` status code. The `location` header in the response contains the URL to check the status of the import job.

**Example Response Header:**

```
location: /knowledge/contentmgr/v4/import/content/7A84B875-6F75-4C7B-B137-0632B62DB0BD
```
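The same request can be issued programmatically. Below is a minimal Python sketch using only the standard library; the `build_import_payload` helper and the `API_DOMAIN`/`ACCESS_TOKEN` constants are illustrative names introduced here, not part of the API itself.

```python
import json
import urllib.request

# Hypothetical values -- replace with your API domain and OAuth 2.0 access token.
API_DOMAIN = "api.example.com"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"


def build_import_payload(bucket_path, region, access_key_id, secret_access_key,
                         schedule_date=None):
    """Build the request body shown above; scheduleTime is optional."""
    payload = {
        "dataSource": {
            "type": "AWS S3 bucket",
            "path": bucket_path,
            "region": region,
            "credentials": {
                "accessKeyId": access_key_id,
                "secretAccessKey": secret_access_key,
            },
        },
        "operation": "import",
    }
    if schedule_date is not None:
        payload["scheduleTime"] = {"date": schedule_date}
    return payload


def start_ingestion(payload):
    """POST the payload; on 202 Accepted, return the status URL from `location`."""
    req = urllib.request.Request(
        f"https://{API_DOMAIN}/knowledge/contentmgr/v4/import/content",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        if resp.status != 202:
            raise RuntimeError(f"Unexpected status: {resp.status}")
        return resp.headers["location"]


# Example usage (uncomment once real credentials are in place):
# body = build_import_payload(
#     "s3://your-bucket-name/your-folder/",
#     "your-aws-region",
#     "YOUR_AWS_ACCESS_KEY_ID",
#     "YOUR_AWS_SECRET_ACCESS_KEY",
#     schedule_date="2024-03-01T10:00:00.000Z",
# )
# print("Poll job status at:", start_ingestion(body))
```

The returned `location` path can then be polled to track the import job's progress.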