Overview

Content stored in an AWS S3 bucket can be imported into VIDIZMO automatically. Videos and other media files in the bucket are detected and transferred into your VIDIZMO portal without manual intervention, so the content is readily available for access, streaming, and sharing within the video content management system.


This article provides a step-by-step guide to ingesting content from an S3 bucket into VIDIZMO.


Prerequisites 

  • You must have Administrator or Manager privileges in the VIDIZMO Portal to perform the necessary configurations.
  • Ensure that you have access to an active AWS account with the necessary permissions to create and manage an S3 bucket.
  • You must have an AWS S3 bucket from which you want to ingest content into the VIDIZMO Portal. Note the S3 bucket name, Access Key, and Secret Key, as these are required during the configuration process.
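
Before entering these values in the portal, it can help to sanity-check them. The following is a minimal sketch, not part of VIDIZMO or the AWS SDK: the function name check_ingestion_inputs is hypothetical, and the pattern covers only a subset of the S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit; no adjacent dots). It does not contact AWS, so it cannot confirm that the keys are valid or that the bucket exists.

```python
import re

# Subset of the S3 bucket naming rules (assumption: general purpose buckets):
# 3-63 characters, lowercase letters/digits/dots/hyphens,
# starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def check_ingestion_inputs(bucket_name, access_key, secret_key):
    """Return a list of problems; an empty list means the values look usable."""
    problems = []
    if not _BUCKET_NAME.fullmatch(bucket_name) or ".." in bucket_name:
        problems.append("bucket name does not follow S3 naming rules")
    if not access_key.strip():
        problems.append("access key is empty")
    if not secret_key.strip():
        problems.append("secret key is empty")
    return problems
```

An empty result only means the values are well formed; the portal still performs the real authentication when the app is enabled.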


Note: In the VIDIZMO Portal, you can set up a storage provider of your choice. You do not need to configure AWS as the storage provider in order to ingest content from an AWS S3 bucket.


Ingesting Content from S3 Bucket

To ingest content from an S3 bucket into VIDIZMO, follow these steps:

  1. Log into the Portal and click the Menu icon on the top left-hand corner of the screen to open the left navigation pane. 

  2. Expand the Admin section by clicking on the down arrow.  

  3. Click on Portal Settings from the navigation panel.

  4. On the Portal Settings navigation pane, select Apps.

  5. Go to Content Ingestion.

  6. Select the gear icon to configure the S3 Bucket App.



Configuring S3 Bucket App



  1. Enter the Access Key for the AWS account containing the S3 bucket from which you intend to ingest your content. This key is essential for authentication and access to the specified AWS resources.
  2. Enter the Secret Key for the same AWS account. This key is also required for authentication and for access to the specified resources. For additional guidance, consult the "AWS Identity and Access Management" guide in the AWS documentation.
  3. Specify the region of your S3 bucket. Ensure that the region you enter matches the region in which the bucket was created.
  4. Provide the name of the AWS S3 bucket from which you intend to ingest content.
  5. Ingested Content Setup

    1. How do you want to organize content: Select the import mode that determines how ingested content is displayed in VIDIZMO. Choose "Hierarchy" to maintain the original folder structure or "Flat" to import each item individually without folders. To learn more about this concept, refer to "Content organization preference."
    2. Include/Exclude Folders: These fields are optional. If no folders are specified, the entire content of the S3 bucket is ingested according to the content organization preference selected above. Click the Add button to include folders in content ingestion, and specify the path of each folder in the text field that appears. For example: "Users\Username1\Documents\ProjectFiles". To add more folders, click the Add button again; each click generates an additional text field.
    3. Actions for source content: Select from the drop-down menu to define post-ingestion action.
      • Keep Content Unaltered Post Ingestion: The content in the source bucket remains unchanged after ingestion.
      • Delete from Source Bucket Post Ingestion: The content is automatically deleted from the bucket after it has been successfully ingested.
      • Move Content to S3 Folder Post Ingestion: The content is relocated to a target AWS bucket and folder path that you specify. Selecting this option reveals additional fields:
        1. Target S3 Bucket Name: Enter the name of the AWS bucket to which the content will move after ingestion.
        2. S3 Destination Folder: Specify the S3 bucket directory to which the content should be relocated after ingestion in the portal. Example: folder/subfolder. If the specified folder doesn't exist in the bucket, AWS will create it.
    4. Use the drop-down menu to specify the content state after ingestion:
      • Publish: Automatically publish ingested content.
      • Drafted: Retain ingested content in the Draft tab.
    5. Choose the viewing access for ingested content.
      • Portal Security/Publish Settings: Viewing access is determined by portal settings configured in the control panel.
      • Anonymous Users: This option allows anonymous users to view content. It is not available in the DEMS package, but it is available in the Enterprise Tube package when using the AWS S3 ingestion app.
      • Portal Users: All portal users can view ingested content.
      • Account and Portal Users: Both account and portal users have access to view ingested content.
    6. Time Interval: Specify the number of seconds for which the system remains idle, with no active tasks or operations, after completing one ingestion cycle. The minimum recommended value for this interval is 5 seconds.
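
To make the Include/Exclude Folders and content organization settings concrete, here is a minimal sketch of how such filtering could behave. It is an illustration only and not VIDIZMO's implementation: the function names are hypothetical, and it assumes that object keys in the bucket use "/" as the folder delimiter (as S3 keys conventionally do) and that backslashes typed in a folder path are treated as "/".

```python
import posixpath


def normalize_folder(path):
    """Use forward slashes and a trailing slash so prefixes match whole folder names."""
    return path.replace("\\", "/").strip("/") + "/"


def select_keys(keys, include=(), exclude=()):
    """Keep keys under any include folder (all keys if none is given), minus excluded folders."""
    include = [normalize_folder(p) for p in include]
    exclude = [normalize_folder(p) for p in exclude]
    selected = []
    for key in keys:
        if include and not any(key.startswith(p) for p in include):
            continue  # not inside any included folder
        if any(key.startswith(p) for p in exclude):
            continue  # inside an excluded folder
        selected.append(key)
    return selected


def portal_location(key, mode):
    """Return (folder, file name): "Hierarchy" keeps the folder path, "Flat" drops it."""
    folder, name = posixpath.split(key)
    return (folder if mode == "Hierarchy" else "", name)
```

For example, with the keys "projects/a/clip.mp4", "projects/tmp/raw.mp4", and "archive/old.mp4", including "projects" and excluding "projects\tmp" leaves only "projects/a/clip.mp4". In "Hierarchy" mode that item keeps the folder "projects/a"; in "Flat" mode it is imported without a folder.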



6. Click on the arrow to reveal Advance Settings. 

7. Content File Grouping

As part of your content ingestion process, you can configure file grouping to organize your files better. To understand the concept, refer to "Content File Grouping."


Choose a File Group Type to determine your preferred method for organizing content:

  • None: When the user selects "None," no grouping is applied.  
  • Substring: Group files that share a common run of characters at a given position in their file names.
  • Regular Expression: Group files using a specific pattern.
  • Last Folder: Group files based on the last folder.

Note: To group files by a File Group Type such as Substring, Regular Expression, or Last Folder, the file names must have something in common. Without such commonalities, the content may not be ingested successfully. For example: audio123.mp3, audio45.vtt, and audio89.mp4.


By default, the None option is selected, indicating that all files are ingested as Original Content.



If you selected Substring, configure the following fields:



  1. Start Position: Specify the numeric start index for substring grouping. A substring is extracted from the file name starting at this position.
  2. No. of characters to include: Provide the number of characters to take from the file name, counting from the Start Position.
  3. Minimum Group File Count: Set the minimum number of files for each group. This field is mandatory regardless of the file group type chosen. For optimal content ingestion in VIDIZMO, a minimum count of 2 files per group is recommended.

For example, files like "Audio_Song.wav" and "Audio_Song.json" are grouped together because they share the common substring "Audio_".
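
The substring settings can be pictured with a short sketch. This is an illustration under stated assumptions, not VIDIZMO's code: the function name is hypothetical, the Start Position is assumed to be zero-based, and files that do not reach the minimum count are simply left out of the result (the article does not say how the portal treats them).

```python
from collections import defaultdict


def group_by_substring(file_names, start, length, min_count=2):
    """Group file names by the substring name[start:start + length]."""
    groups = defaultdict(list)
    for name in file_names:
        groups[name[start:start + length]].append(name)
    # Keep only groups that reach the minimum group file count.
    return {key: names for key, names in groups.items() if len(names) >= min_count}
```

With a Start Position of 0 and 6 characters, "Audio_Song.wav" and "Audio_Song.json" both yield the substring "Audio_" and form one group, while a lone "Video_Clip.mp4" does not reach the minimum count of 2.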


If you chose Regular Expression, configure this field:



  1. Regex Pattern for Grouping: Define the regex pattern for grouping media files. You can create and test your custom file grouping regex at regex101.


Sample Regex: (?<GroupName>(\d|[a-zA-Z])+)\.(mp4|vtt|json|txt|wav|png|ext)

The "?<GroupName>" part of the regular expression is mandatory: the regex provided for grouping must contain "(?<GroupName>your_pattern)", where "your_pattern" is replaced with the desired pattern. "?<GroupName>" is a variable that holds the name of the group to be created, so the text matched by "your_pattern" is used as the group name. After the group name part, provide a regex pattern to extract the common strings from the file names that belong to a group. A group is equivalent to a mashup; therefore, multiple groups mean multiple mashups to be ingested.


Users are responsible for verifying that the regex they enter is valid.

 

  • Example 1: (?<GroupName>[a-z].*)\.wav

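The above regex creates a group from a .wav file name that starts with a lowercase letter followed by any number of characters. The group name is the part of the file name before the extension, because the "?<GroupName>" group appears on the left side of the "\.".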

  • Example 2: .*(?<GroupName>best).*\.wav

The above regex creates a group from the file name containing the word "best." The group name will be "best".

  • Example 3: .*(?<GroupName>best|ant).*\.wav

The above regex creates a group from the file name containing either "best" or "ant". Files containing "best" would be grouped in the "best" group, and the same applies to "ant".

 

Replace the ".wav" extension with the extension of the files you wish to group. Multiple patterns can be given, separated by the pipe operator "|".


  2. Minimum File Count in a Group: Set the minimum number of files in a group. For example, if the minimum group file count is set to 2, grouping happens only when at least 2 files produce the same group name. This ensures that files are grouped together only when they satisfy this minimum count condition.
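
The regex examples above can be tried outside the portal with a short sketch. Several things here are assumptions rather than documented VIDIZMO behavior: the function names are hypothetical, the pattern is applied as a search anywhere in the file name, and files in groups below the minimum count are simply omitted. Python's re module writes named groups as "(?P<GroupName>...)", so the sketch first rewrites the "(?<GroupName>...)" form used in this article.

```python
import re
from collections import defaultdict


def to_python_regex(pattern):
    """Rewrite (?<Name>...) named groups as Python's (?P<Name>...)."""
    return re.sub(r"\(\?<([A-Za-z_]\w*)>", r"(?P<\1>", pattern)


def group_by_regex(file_names, pattern, min_count=2):
    """Group file names by the text captured in the GroupName group."""
    compiled = re.compile(to_python_regex(pattern))
    groups = defaultdict(list)
    for name in file_names:
        match = compiled.search(name)
        if match and match.group("GroupName"):
            groups[match.group("GroupName")].append(name)
    # Keep only groups that reach the minimum file count.
    return {key: names for key, names in groups.items() if len(names) >= min_count}
```

Run against Example 3, the files "best_take1.wav" and "the_best_take2.wav" fall into a "best" group and "ant_walk.wav" and "ant_run.wav" into an "ant" group, while "other.wav" matches nothing and stays ungrouped.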


If you chose Last Folder, configure this field:


  1. The files are grouped based on the content of their last folder.
  2. For Last Folder, only one field is required: "Minimum File Count in a Group". Files are grouped only when at least the specified minimum number of files is present within the same folder.
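
As a rough illustration of Last Folder grouping, the sketch below groups object keys by the folder that directly contains them. The function name is hypothetical, and keying each group by the full folder path (so that same-named folders in different locations stay separate) is an assumption, not documented VIDIZMO behavior.

```python
import posixpath
from collections import defaultdict


def group_by_last_folder(keys, min_count=2):
    """Group object keys by the folder that directly contains them."""
    groups = defaultdict(list)
    for key in keys:
        groups[posixpath.dirname(key)].append(key)
    # Keep only folders that hold at least the minimum number of files.
    return {folder: names for folder, names in groups.items() if len(names) >= min_count}
```

For the keys "cases/case1/video.mp4", "cases/case1/video.vtt", and "cases/case2/audio.wav" with a minimum count of 2, only the two files in "cases/case1" form a group.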



To better understand how to group files, refer to our article Understanding Ingestion of Content from an AWS S3 Bucket in VIDIZMO.


8. Content File Type Mapping

Define rules for mapping associated metadata files post-content ingestion. This section consists of content file parts, each responsible for storing specific types of files. Users can define rules to determine which file type belongs to which part. Multiple rules can be specified, each with its own criteria and associated content file part. If a file meets any of the provided criteria, it is placed into the corresponding content part.


Users can store files in multiple formats, such as .vtt and .json. There is no fixed association between a file format and a specific part, so users decide which part of the content file holds each format. However, rules should be defined carefully: placing files in unintended parts could cause the content to malfunction.


The following media file sections are available:
  • Audio PCM: Reserved for digitally encoded audio data using PCM.
  • Closed Caption:  Designated to store closed captions associated with video content.
  • Content: A section dedicated to the primary content files.
  • Supporting Files: This section is capable of storing files that support the main content, such as metadata, additional documentation, or related files.
  • Thumbnails: Designated for storing thumbnail images associated with the content.
  • Original Content: Reserved for the storage of the original content file.


Note: At least one rule that maps files to the "OriginalContent" media file section is mandatory. Moreover, if no media file section rule is specified for a file, the file is placed in the Supporting Files section.


  1. Map ingested file(s) to: From the drop-down menu, choose the media file section in which the matching media files are stored.
  2. Regex For File Type: Specify the regex pattern for the media file section rule. For example, the pattern .*\.mp4 selects all .mp4 files and stores them in the chosen media file section. To add more rules, press "Add Section"; each press adds another set of fields. To include all files, enter ".*" in the Regex pattern field.

Note: Selecting File Group Type: None restricts the mapping of ingested files to Original Content only. In that case, define a file type's regex pattern in Content File Type Mapping to ingest only that specific type as Original Content.
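
The mapping rules can be sketched as an ordered list of (regex, section) pairs. This is an illustration only: the function names are hypothetical, and both the first-match precedence and the whole-name matching are assumptions, since the article does not state how the portal resolves a file that satisfies more than one rule. The fallback to Supporting Files follows the note above.

```python
import re

DEFAULT_SECTION = "Supporting Files"


def map_to_section(file_name, rules):
    """Return the section of the first rule whose regex matches the whole file name."""
    for pattern, section in rules:
        if re.fullmatch(pattern, file_name):
            return section
    return DEFAULT_SECTION  # no rule matched this file


def has_original_content_rule(rules):
    """Check the mandatory condition: at least one rule targets Original Content."""
    return any(section == "Original Content" for _, section in rules)
```

For example, with the rules (".*\.mp4", "Original Content") and (".*\.vtt", "Closed Caption"), "clip.mp4" maps to Original Content, "clip.vtt" to Closed Caption, and "notes.json" falls through to Supporting Files.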



9. Select Save Changes.


Enabling S3 Bucket App

  1. Initiate the content ingestion process by enabling the toggle button on the content ingestion screen.



  2. Click on the progress option to view the status of content ingestion.



The application operates in three distinct states:

  • Iteration Start: This initial state indicates that the ingestion process has started transferring content from the AWS S3 bucket to VIDIZMO. The current state then shows Importing Content.
  • Importing Content: In this phase, the modal displays information on the content being ingested:
    • Files Discovered to Ingest displays the count of files identified for ingestion into the portal from the bucket.
    • Ingested Content Count displays the number of files ingested so far, along with the content file parts that will be ingested in Media File Sections.
    • Total Content to ingest in iteration displays the overall content count, reflecting the total number of files in the system in an iteration.
  • Iteration Completed: The final state signifies the completion of content ingestion from the AWS S3 bucket to VIDIZMO.