Overview of Supported Formats

Ravvio’s knowledge base supports a range of document formats, so you can train your AI agent on your existing business content and documentation.

File Format Categories

Document Formats

PDF, DOCX, TXT, and MD files for comprehensive text-based content

Data Formats

CSV and JSON files for structured data and tabular information

Web Formats

HTML files for web content and formatted online documentation

Specialized Content

Markdown and structured text files for technical documentation

Detailed Format Support

PDF Documents

Microsoft Word Documents (DOCX)

1. Content Recognition: Complete extraction of text, headings, and paragraph structure
2. Formatting Preservation: Maintenance of document hierarchy, bullet points, and numbered lists
3. Metadata Extraction: Capture of document properties, titles, and author information
4. Table Processing: Recognition and extraction of tabular data and structured information

DOCX Best Practices

Content Optimization

Recommended Structure:
  • Use proper heading styles (H1, H2, H3)
  • Maintain consistent formatting throughout
  • Organize content with clear sections
  • Include descriptive titles and headers

Quality Enhancement

Preparation Tips:
  • Remove unnecessary formatting complexity
  • Ensure text is readable and well-organized
  • Use bullet points and numbered lists appropriately
  • Include comprehensive content without excessive styling

Plain Text Files (TXT)

CSV Files (Comma-Separated Values)

Data Structure

Supported Content:
  • Product catalogs with specifications
  • Pricing tables and rate sheets
  • FAQ databases with questions and answers
  • Contact directories and staff listings
  • Inventory lists and availability data

Processing Features

CSV Capabilities:
  • Header row recognition and column mapping
  • Data type detection and validation
  • Relationship identification between columns
  • Automatic formatting for readable responses

CSV Optimization

1. Header Preparation: Use clear, descriptive column headers that indicate content type
2. Data Consistency: Ensure consistent data formatting within each column
3. Content Completeness: Fill in all relevant cells to avoid incomplete information
4. Logical Organization: Arrange columns in logical order from most to least important
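
The header and completeness checks above can be run locally before upload. The following Python sketch is illustrative only and is not part of Ravvio: the function name `check_csv` and the wording of its messages are assumptions, and it checks only for missing or duplicate headers, rows of the wrong width, and blank cells.

```python
import csv
import io


def check_csv(text: str) -> list[str]:
    """Return a list of human-readable problems found in CSV text."""
    problems = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    header, data = rows[0], rows[1:]

    # Header preparation: every column needs a non-empty, unique name.
    seen = set()
    for index, name in enumerate(header, start=1):
        cleaned = name.strip()
        if not cleaned:
            problems.append(f"column {index} has no header")
        elif cleaned.lower() in seen:
            problems.append(f"duplicate header: {cleaned}")
        seen.add(cleaned.lower())

    # Consistency and completeness: same width per row, no blank cells.
    # Row numbers count the header as row 1.
    for row_no, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no} has {len(row)} cells, expected {len(header)}"
            )
            continue
        for name, cell in zip(header, row):
            if not cell.strip():
                problems.append(
                    f"row {row_no} is missing a value for '{name.strip()}'"
                )
    return problems


if __name__ == "__main__":
    sample = "product,price\nWidget,9.99\nGadget,\n"
    for problem in check_csv(sample):
        print(problem)
```

An empty result means none of these particular issues were found; it does not guarantee how the file will be processed after upload.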

JSON Files (JavaScript Object Notation)

Markdown Files (MD)

Markdown Advantages

Benefits:
  • Perfect for technical documentation
  • Excellent formatting preservation
  • Code block and syntax highlighting support
  • Link and reference management
  • Table and list structure recognition

Content Types

Ideal Applications:
  • API documentation and developer guides
  • Technical specifications and requirements
  • User manuals with code examples
  • README files and project documentation
  • Knowledge base articles with formatting

Markdown Processing Features

1. Syntax Recognition: Complete support for standard Markdown syntax and formatting
2. Structure Preservation: Maintenance of headings, lists, tables, and code blocks
3. Link Processing: Recognition and preservation of internal and external links
4. Content Integration: Seamless integration with other knowledge base content

HTML Files

File Size and Limitations

Size Restrictions

Individual File Limits

Maximum file size varies by format:
  • PDF: 50MB maximum
  • DOCX: 25MB maximum
  • TXT: 10MB maximum
  • CSV: 15MB maximum
  • JSON: 10MB maximum
  • MD: 5MB maximum
  • HTML: 5MB maximum
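
A pre-upload check against these per-format limits can be scripted locally. The Python sketch below is illustrative and not part of Ravvio: the names `size_problem` and `check_folder` are hypothetical, and it assumes "MB" means 1,048,576 bytes, which the limits above do not specify.

```python
from pathlib import Path
from typing import Optional

MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes

# Per-format limits in MB, copied from the list above.
MAX_SIZE_MB = {
    ".pdf": 50,
    ".docx": 25,
    ".txt": 10,
    ".csv": 15,
    ".json": 10,
    ".md": 5,
    ".html": 5,
}


def size_problem(filename: str, size_bytes: int) -> Optional[str]:
    """Return a message if the file is unsupported or too large, else None."""
    suffix = Path(filename).suffix.lower()
    limit = MAX_SIZE_MB.get(suffix)
    if limit is None:
        return f"{filename}: unsupported format '{suffix}'"
    if size_bytes > limit * MB:
        return f"{filename}: {size_bytes / MB:.1f} MB exceeds the {limit} MB limit"
    return None


def check_folder(folder: str) -> list[str]:
    """Check every file directly inside a folder and collect the problems."""
    problems = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            message = size_problem(path.name, path.stat().st_size)
            if message:
                problems.append(message)
    return problems
```

Calling `check_folder("docs_to_upload")` on a local folder (the folder name is a placeholder) returns one message per file that is over its limit or has an extension outside the list.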

Total Knowledge Base

Overall limitations:
  • Total storage per account varies by plan
  • Free accounts: 100MB total storage
  • Paid accounts: 1GB+ total storage
  • Enterprise: Custom storage allocations
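
To see whether a batch of documents fits within a plan's total storage, you can add up the batch size locally. This is a rough Python sketch under stated assumptions: the function names are hypothetical, "MB" is taken as 1,048,576 bytes, and only the free-plan figure is hard-coded because paid and enterprise allocations vary.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes

FREE_PLAN_QUOTA_MB = 100  # from the list above; other plans vary


def folder_size_bytes(folder: str) -> int:
    """Total size of all files in a folder, including subfolders."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def fits_quota(batch_bytes: int, quota_mb: int, already_used_bytes: int = 0) -> bool:
    """True if the batch fits in the quota alongside what is already stored."""
    return already_used_bytes + batch_bytes <= quota_mb * MB
```

For example, `fits_quota(folder_size_bytes("docs_to_upload"), FREE_PLAN_QUOTA_MB)` checks a local folder (a placeholder name) against the free-plan allowance. It compares raw file sizes only; how Ravvio counts stored content toward the quota is not stated here.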

Performance Considerations

Upload Requirements and Recommendations

Technical Requirements

1. File Integrity: Ensure files are not corrupted and open properly in their native applications
2. Format Compliance: Verify files meet format standards and are not password-protected
3. Content Relevance: Confirm content is relevant to your business and customer interactions
4. Quality Assurance: Review content for accuracy, completeness, and current information
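
The integrity and format checks above can be partly automated before upload. The Python sketch below is illustrative and not part of Ravvio: `integrity_problem` is a hypothetical name, the UTF-8 expectation for text formats is an assumption, and the PDF check only looks at the file header, so it does not detect password-protected PDFs. Relevance and accuracy still need a human review.

```python
import json
import zipfile
from pathlib import Path
from typing import Optional

TEXT_SUFFIXES = {".txt", ".csv", ".md", ".html"}


def integrity_problem(path: str) -> Optional[str]:
    """Return a description of the first problem found, or None."""
    file = Path(path)
    suffix = file.suffix.lower()

    if suffix == ".docx":
        # A DOCX file is a ZIP archive containing word/document.xml.
        # Password-protected Word files typically use a different
        # container, so they fail this check as well.
        if not zipfile.is_zipfile(file):
            return "not a valid DOCX container (corrupted or password-protected)"
        with zipfile.ZipFile(file) as archive:
            if "word/document.xml" not in archive.namelist():
                return "archive has no word/document.xml"
            if archive.testzip() is not None:
                return "archive contains a damaged entry"
    elif suffix == ".pdf":
        # Header check only; encryption is not detected here.
        with file.open("rb") as handle:
            if handle.read(5) != b"%PDF-":
                return "missing %PDF- header"
    elif suffix == ".json":
        try:
            json.loads(file.read_text(encoding="utf-8"))
        except ValueError as error:  # covers JSON and decoding errors
            return f"invalid JSON: {error}"
    elif suffix in TEXT_SUFFIXES:
        try:
            file.read_text(encoding="utf-8")  # assumption: UTF-8 expected
        except UnicodeDecodeError:
            return "not valid UTF-8 text"
    else:
        return f"unsupported format '{suffix}'"
    return None
```

A `None` result means the file passed these basic checks, not that it will necessarily process without errors.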

Naming Conventions

File Names

Best practices:
  • Use descriptive names that indicate content
  • Avoid special characters and spaces
  • Include version numbers for updated documents
  • Use consistent naming across related files
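
As one way to apply these file-name practices consistently, the following Python sketch normalizes a name before upload. It is a hypothetical helper, not a Ravvio feature: the lowercase, hyphen-separated style and the `-v<version>` suffix are assumptions chosen for illustration, and non-ASCII letters are replaced along with other special characters.

```python
import re
from pathlib import Path
from typing import Optional


def tidy_filename(name: str, version: Optional[str] = None) -> str:
    """Lowercase the name, replace spaces and special characters with
    hyphens, and optionally append a version marker."""
    path = Path(name)
    stem = re.sub(r"[^a-z0-9]+", "-", path.stem.lower()).strip("-")
    stem = stem or "document"  # fallback if nothing usable remains
    if version:
        stem = f"{stem}-v{version}"
    return stem + path.suffix.lower()


if __name__ == "__main__":
    print(tidy_filename("Pricing Sheet (FINAL)!.PDF"))
    print(tidy_filename("Return Policy.docx", version="2"))
```

Running the same helper over every file in a batch keeps related documents named in one style.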

Organization

Structure tips:
  • Group related documents logically
  • Use prefixes for different content types
  • Include dates for time-sensitive content
  • Maintain a clear categorization system

Content Preparation Guidelines

Pre-Upload Optimization

Testing and Validation

1. Upload Testing: Test the upload process with a small sample of documents first
2. Processing Verification: Confirm documents process correctly without errors
3. Content Validation: Verify extracted content matches original document intent
4. Response Testing: Test AI agent responses using content from uploaded documents

File Format Support: While Ravvio supports these formats, optimal results depend on document quality, structure, and content organization. Well-formatted, clearly structured documents will provide better AI agent performance.

Start Small: Begin with a few high-quality documents to test processing and agent responses before uploading your entire document library.

Copyright Compliance: Ensure you have proper rights to use all uploaded content and that it complies with your organization’s content policies and legal requirements.