UpstageAI/mcp-upstage-server
TypeScript
Description: Node.js/TypeScript MCP server for Upstage AI document processing with parsing, information extraction, schema generation, and classification tools
Language: TypeScript
Stars: 2
Forks: 1
Open issues: 5
Created: 2025-09-03T11:00:24Z
Pushed: 2026-02-11T17:36:52Z
Default branch: main
Fork: no
Archived: no
README:
MCP-Upstage-Server
Node.js/TypeScript implementation of the MCP server for Upstage AI services.
Features
- Document Parsing: Extract structure and content from various document types (PDF, images, Office files)
- Information Extraction: Extract structured information using custom or auto-generated schemas
- Schema Generation: Automatically generate extraction schemas from document analysis
- Document Classification: Classify documents into predefined categories (invoice, receipt, contract, etc.)
- Built with TypeScript for type safety
- Dual transport support: stdio (default) and HTTP Streamable
- Async/await pattern throughout
- Comprehensive error handling and retry logic
- Progress reporting support
Installation
Prerequisites
- Node.js 18.0.0 or higher
- Upstage API key from Upstage Console
Install from npm
# Install globally
npm install -g mcp-upstage-server
# Or use with npx (no installation required)
npx mcp-upstage-server
Install from source
# Clone the repository
git clone https://github.com/UpstageAI/mcp-upstage.git
cd mcp-upstage/mcp-upstage-node
# Install dependencies
npm install
# Build the project
npm run build
# Set up environment variables
cp .env.example .env
# Edit .env and add your UPSTAGE_API_KEY
Usage
Running the server
# With stdio transport (default)
UPSTAGE_API_KEY=your-api-key npx mcp-upstage-server
# With HTTP Streamable transport
UPSTAGE_API_KEY=your-api-key npx mcp-upstage-server --http
# With HTTP transport on custom port
UPSTAGE_API_KEY=your-api-key npx mcp-upstage-server --http --port 8080
# Show help
npx mcp-upstage-server --help
# Development mode (from source)
npm run dev
# Production mode (from source)
npm start
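The flags above compose mechanically, so a launcher script can assemble them from a small options object. The following is a minimal sketch: `ServerLaunchOptions` and `buildServerArgs` are hypothetical names introduced for illustration, and the choice to drop `--port` when `--http` is absent is an assumption, since the README only shows `--port` together with `--http`.

```typescript
// Hypothetical helper; the option names are illustrative and not part of the package.
interface ServerLaunchOptions {
  http?: boolean; // maps to the --http flag
  port?: number;  // maps to --port; assumed to matter only together with --http
}

// Builds the argument list that follows `npx` on the command line.
function buildServerArgs(options: ServerLaunchOptions = {}): string[] {
  const args = ["mcp-upstage-server"];
  if (options.http) {
    args.push("--http");
    if (options.port !== undefined) {
      const port = options.port;
      if (!Number.isInteger(port) || port < 1 || port > 65535) {
        throw new RangeError(`Invalid port: ${port}`);
      }
      args.push("--port", String(port));
    }
  }
  return args;
}
```

The returned array has the same form as the `args` value used in the Claude Desktop entries in the next section, so one helper could feed both a launcher script and a generated client configuration.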
Integration with Claude Desktop
Option 1: stdio transport (default)
{
"mcpServers": {
"upstage": {
"command": "npx",
"args": ["mcp-upstage-server"],
"env": {
"UPSTAGE_API_KEY": "your-api-key-here"
}
}
}
}
Option 2: HTTP Streamable transport
{
"mcpServers": {
"upstage-http": {
"command": "npx",
"args": ["mcp-upstage-server", "--http", "--port", "3000"],
"env": {
"UPSTAGE_API_KEY": "your-api-key-here"
}
}
}
}
Transport Options
stdio Transport (Default)
- Pros: Simple setup, direct process communication
- Cons: Single client connection only
- Usage: Default mode, no additional configuration needed
HTTP Streamable Transport
- Pros: Multiple client support, network accessible, RESTful API
- Cons: Requires port management, network configuration
- Endpoints:
  - POST /mcp: Main MCP communication endpoint
  - GET /mcp: Server-Sent Events stream
  - GET /health: Health check endpoint
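When the server runs with `--http`, a client or deployment script may want to confirm it is reachable before sending MCP traffic. The sketch below derives the endpoint URLs from the paths listed above and probes `GET /health`. The names `mcpEndpoints`, `FetchLike`, and `isServerHealthy` are hypothetical, plain `http://` is assumed, and the health response body is ignored because its format is not documented in this excerpt.

```typescript
// Endpoint paths come from the list above; host and port are supplied by the caller.
function mcpEndpoints(host: string, port: number) {
  const base = `http://${host}:${port}`;
  return {
    mcp: `${base}/mcp`,       // POST: MCP communication; GET: Server-Sent Events stream
    health: `${base}/health`, // GET: health check
  };
}

// Minimal structural type so the sketch does not depend on DOM or Node typings.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

// Resolves to true when GET /health answers with a 2xx status.
async function isServerHealthy(
  host: string,
  port: number,
  fetchImpl: FetchLike,
): Promise<boolean> {
  try {
    const response = await fetchImpl(mcpEndpoints(host, port).health);
    return response.ok;
  } catch {
    return false; // connection refused, DNS failure, and similar network errors
  }
}
```

On Node.js 18 or later the global `fetch` can be passed as `fetchImpl`, for example `isServerHealthy("localhost", 3000, fetch)`. Injecting the function keeps the check easy to stub in tests.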
Available Tools
parse_document
Parse a document using Upstage AI's document digitization API.
Parameters:
- file_path (required): Path to the document file
- output_formats (optional): Array of output formats (e.g., ['html', 'text', 'markdown'])
Supported formats: PDF, JPEG, PNG, TIFF, BMP, GIF, WEBP
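A caller can reject unsupported files before invoking `parse_document`. The sketch below builds the tool arguments from the two parameters above. The helper name `buildParseDocumentArgs` is hypothetical, and the extension set (including the `.jpg` and `.tif` spellings) is an assumption derived from the format list, not a confirmed server-side rule.

```typescript
// Extensions inferred from the supported-format list; the exact set the
// server accepts is an assumption.
const PARSE_EXTENSIONS = new Set([
  "pdf", "jpeg", "jpg", "png", "tiff", "tif", "bmp", "gif", "webp",
]);

interface ParseDocumentArgs {
  file_path: string;
  output_formats?: string[];
}

// Validates the file extension and returns arguments for parse_document.
function buildParseDocumentArgs(
  filePath: string,
  outputFormats?: string[],
): ParseDocumentArgs {
  const dot = filePath.lastIndexOf(".");
  const extension = dot === -1 ? "" : filePath.slice(dot + 1).toLowerCase();
  if (!PARSE_EXTENSIONS.has(extension)) {
    throw new Error(`Unsupported file type for parse_document: "${extension}"`);
  }
  // Omit output_formats entirely when none were requested.
  return outputFormats && outputFormats.length > 0
    ? { file_path: filePath, output_formats: outputFormats }
    : { file_path: filePath };
}
```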
extract_information
Extract structured information from documents using Upstage Universal Information Extraction.
Parameters:
- file_path (required): Path to the document file
- schema_path (optional): Path to JSON schema file
- schema_json (optional): JSON schema as string
- auto_generate_schema (optional, default: true): Auto-generate schema if none provided
Supported formats: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX
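The schema-related parameters interact: as the Schema Guide later in this README states, a custom schema is required once `auto_generate_schema` is false. A client-side check along these lines could surface that before the call. This is a sketch only: `checkExtractArgs` is a hypothetical helper, and the warning about passing both `schema_path` and `schema_json` reflects that their precedence is not documented here, not a known server rule.

```typescript
interface ExtractInformationArgs {
  file_path: string;
  schema_path?: string;
  schema_json?: string;
  auto_generate_schema?: boolean; // documented default: true
}

// Returns a list of human-readable problems; an empty list means the
// arguments look consistent.
function checkExtractArgs(args: ExtractInformationArgs): string[] {
  const problems: string[] = [];
  const hasSchema =
    args.schema_path !== undefined || args.schema_json !== undefined;

  if (args.auto_generate_schema === false && !hasSchema) {
    problems.push(
      "auto_generate_schema is false but neither schema_path nor schema_json was given",
    );
  }
  if (args.schema_path !== undefined && args.schema_json !== undefined) {
    // Assumption: flagged because precedence between the two is not documented.
    problems.push("both schema_path and schema_json were given");
  }
  if (args.schema_json !== undefined) {
    try {
      JSON.parse(args.schema_json);
    } catch {
      problems.push("schema_json is not valid JSON");
    }
  }
  return problems;
}
```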
generate_schema
Generate an extraction schema for a document using Upstage AI's schema generation API.
Parameters:
- file_path (required): Path to the document file to analyze
Supported formats: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX
This tool analyzes a document and automatically generates a JSON schema that defines the structure and fields that can be extracted from similar documents. The generated schema can then be used with the extract_information tool when auto_generate_schema is set to false.
Use cases:
- Create reusable schemas for multiple similar documents
- Have more control over extraction fields
- Ensure consistent field naming across extractions
The tool returns both a readable schema object and a schema_json string that can be directly copied and used with the extract_information tool.
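The reuse workflow described above (generate once, extract many times) might be wired up as follows. This sketch assumes the `generate_schema` result has already been obtained through an MCP client. The field name `schema` for the readable object is a guess; only `schema_json` is named in the text above. Both helper names are hypothetical.

```typescript
// Assumed result shape: `schema_json` is confirmed by the README, while the
// property name `schema` for the readable object is an assumption.
interface GenerateSchemaResult {
  schema?: unknown;
  schema_json: string;
}

// Turns a generated schema into extract_information arguments for one file.
function extractArgsFromGenerated(
  filePath: string,
  generated: GenerateSchemaResult,
) {
  JSON.parse(generated.schema_json); // throws early if the string is malformed
  return {
    file_path: filePath,
    schema_json: generated.schema_json,
    auto_generate_schema: false, // use the supplied schema instead of a new one
  };
}

// Reuses one generated schema across several similar documents.
function extractArgsForBatch(
  filePaths: string[],
  generated: GenerateSchemaResult,
) {
  return filePaths.map((path) => extractArgsFromGenerated(path, generated));
}
```

Passing the same `schema_json` to every call is what keeps field names consistent across extractions, which is the third use case listed above.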
classify_document
Classify a document into predefined categories using Upstage AI's document classification API.
Parameters:
- file_path (required): Path to the document file to classify
- schema_path (optional): Path to JSON file containing custom classification schema
- schema_json (optional): JSON string containing custom classification schema
Supported formats: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX
This tool analyzes a document and classifies it into categories. By default, it uses a comprehensive set of document types, but you can provide custom classification categories.
Default categories:
- invoice, receipt, contract, cv, bank_statement, tax_document, insurance, business_card, letter, form, certificate, report, others
Use cases:
- Automatically sort and organize documents by type
- Filter documents for specific processing workflows
- Build document management systems with automatic categorization
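For the sorting and filtering use cases, classification labels can be grouped on the client side. The sketch below assumes each result has already been reduced to a file path and a label string. The `Classified` shape and both function names are hypothetical, because the tool's response format is not shown in this excerpt. Labels outside the default list fall back to `others`, which is a design choice of this sketch.

```typescript
// Default categories as listed above.
const DEFAULT_CATEGORIES = [
  "invoice", "receipt", "contract", "cv", "bank_statement", "tax_document",
  "insurance", "business_card", "letter", "form", "certificate", "report",
  "others",
] as const;

type DefaultCategory = (typeof DEFAULT_CATEGORIES)[number];

// Hypothetical input: a file path paired with the label from classify_document.
interface Classified {
  filePath: string;
  label: string;
}

// Maps any label to a default category, using "others" for unknown labels.
function normalizeCategory(label: string): DefaultCategory {
  const match = DEFAULT_CATEGORIES.find((category) => category === label);
  return match ?? "others";
}

// Groups file paths by category, preserving input order within each group.
function groupByCategory(items: Classified[]): Map<DefaultCategory, string[]> {
  const groups = new Map<DefaultCategory, string[]>();
  for (const item of items) {
    const category = normalizeCategory(item.label);
    const bucket = groups.get(category) ?? [];
    bucket.push(item.filePath);
    groups.set(category, bucket);
  }
  return groups;
}
```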
Schema Guide for Information Extraction
When auto_generate_schema is false, you need to provide a custom schema. Here's how to format it correctly:
📋 Basic Schema Structure
The schema must follow this exact structure:
{
"type": "json_schema",
"json_schema": {
"name": "document_schema",
"schema": {
"type": "object",
"properties": {
"field_name": {
"type": "string|number|array|object",
"description": "Description of what to extract"
}
}
}
}
}
❌ Common Mistakes
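The mistakes in this section are structural, so they can be caught before a schema reaches `extract_information`. The sketch below wraps a flat field map in the required envelope and reports missing layers. `wrapFields` and `findSchemaProblems` are hypothetical helpers that check only the structure shown above. They do not reproduce whatever validation the server or the Upstage API performs.

```typescript
type FieldSpec = { type: string; description?: string; [key: string]: unknown };

// Wraps a flat field map (the first mistake below) in the required structure.
function wrapFields(
  fields: Record<string, FieldSpec>,
  schemaName = "document_schema",
) {
  return {
    type: "json_schema",
    json_schema: {
      name: schemaName,
      schema: { type: "object", properties: fields },
    },
  };
}

// Reports which layers of the required structure are missing or wrong.
function findSchemaProblems(candidate: unknown): string[] {
  const isObject = (value: unknown): value is Record<string, unknown> =>
    typeof value === "object" && value !== null && !Array.isArray(value);

  if (!isObject(candidate)) return ["schema must be a JSON object"];
  const problems: string[] = [];
  if (candidate.type !== "json_schema") {
    problems.push('top-level "type" must be "json_schema"');
  }
  const wrapper = candidate.json_schema;
  if (!isObject(wrapper)) {
    problems.push('missing "json_schema" object');
    return problems;
  }
  const inner = wrapper.schema;
  if (!isObject(inner)) {
    problems.push('missing "json_schema.schema" object');
    return problems;
  }
  if (inner.type !== "object") {
    problems.push('"json_schema.schema.type" must be "object"');
  }
  if (!isObject(inner.properties)) {
    problems.push('missing "properties" wrapper');
  }
  return problems;
}
```

For example, `JSON.stringify(wrapFields({ company_name: { type: "string" } }))` yields a string suitable for the `schema_json` parameter.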
Wrong: Missing nested structure
{
"company_name": {
"type": "string"
}
}
Wrong: Incorrect response_format
{
"schema": {
"company_name": "string"
}
}
Wrong: Missing properties wrapper
{
"type": "json_schema",…Excerpt shown — open the source for the full document.