Fork by SiliconFlow, published Dec 16, 2025

siliconflow/Comfyui-SecNodes

forked from 9nate-drake/Comfyui-SecNodes

Description: Comfyui implementation of OpenIXCLab Sec-4B

Language: Python

License: Apache-2.0

Stars: 0

Forks: 0

Open issues: 0

Created: 2025-12-16T12:58:43Z

Pushed: 2025-12-16T12:59:17Z

Default branch: main

Fork: yes

Parent repository: 9nate-drake/Comfyui-SecNodes

Archived: no

README:

ComfyUI SeC Nodes

ComfyUI custom nodes for SeC (Segment Concept) - State-of-the-art video object segmentation that outperforms SAM 2.1, utilizing the SeC-4B model developed by OpenIXCLab.

Changelog

v1.2 (2025-10-16) - FP8 Removal & Performance Optimizations

⚠️ IMPORTANT BREAKING CHANGE: FP8 support has been removed due to fundamental numerical instability issues. Use FP16 or BF16 models instead.

What Changed:

  • FP8 quantization disabled - produces NaN values in language model embeddings during scene detection
  • All users should migrate to FP16 or BF16 models (same segmentation quality, fully reliable)
  • Memory optimization: Pre-allocated output tensor (saves 600-800MB VRAM spike)
  • Scene detection resolution optimization: 1024x1024 → 512x512 (saves 200-400MB, no quality impact)
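The resolution change above can be checked with simple arithmetic. The sketch below only shows why halving each side of a frame quarters its footprint. The 3-channel layout and 2 bytes per value (FP16) are assumptions made for illustration, and `frame_bytes` is a hypothetical helper, not part of the node code. The real saving also depends on frame count and intermediate activations, which this does not model.

```python
def frame_bytes(height, width, channels=3, bytes_per_value=2):
    """Bytes needed to hold one dense frame of the given shape.

    channels=3 and bytes_per_value=2 (FP16) are illustrative assumptions.
    """
    return height * width * channels * bytes_per_value


before = frame_bytes(1024, 1024)  # old scene detection resolution
after = frame_bytes(512, 512)     # new scene detection resolution

# Halving both height and width leaves one quarter of the pixels.
ratio = before // after
```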

Full Technical Details: See [CHANGELOG.md](CHANGELOG.md) for comprehensive investigation and FP8 failure analysis.

v1.1 (2025-10-13) - Single-File Models

  • Single-file model formats: Download just one file instead of sharded 4-file format

Download: Single-file models available at https://huggingface.co/VeryAladeen/Sec-4B

What is SeC?

SeC (Segment Concept) is a breakthrough in video object segmentation that shifts from simple feature matching to high-level conceptual understanding. Unlike SAM 2.1, which relies primarily on visual similarity, SeC uses a Large Vision-Language Model (LVLM) to understand *what* an object is conceptually, enabling robust tracking through:

  • Semantic Understanding: Recognizes objects by concept, not just appearance
  • Scene Complexity Adaptation: Automatically balances semantic reasoning vs feature matching
  • Superior Robustness: Handles occlusions, appearance changes, and complex scenes better than SAM 2.1
  • SOTA Performance: +11.8 points over SAM 2.1 on SeCVOS benchmark

How SeC Works

1. Visual Grounding: You provide initial prompts (points/bbox/mask) on one frame

2. Concept Extraction: SeC's LVLM analyzes the object to build a semantic understanding

3. Smart Tracking: Dynamically uses both semantic reasoning and visual features

4. Keyframe Bank: Maintains diverse views of the object for robust concept understanding

The result? SeC tracks objects more reliably through challenging scenarios like rapid appearance changes, occlusions, and complex multi-object scenes.

Demo

https://github.com/user-attachments/assets/5cc6677e-4a9d-4e55-801d-b92305a37725

*Example: SeC tracking an object through scene changes and dynamic movement*

https://github.com/user-attachments/assets/9e99d55c-ba8e-4041-985e-b95cbd6dd066

*Example: SAM fails to track the correct dog in some scenes*

Features

  • SeC Model Loader: Load SeC models with simple settings
  • SeC Video Segmentation: SOTA video segmentation with visual prompts
  • Coordinate Plotter: Visualize coordinate points before segmentation
  • Self-Contained: All inference code bundled - no external repos needed
  • Bidirectional Tracking: Track from any frame in any direction
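To make "track from any frame in any direction" concrete, here is a minimal sketch of how frame visiting order could be derived from a prompt frame. The function name `tracking_order` and the direction values are hypothetical and chosen for illustration. The excerpt does not show the node's actual parameters or internals.

```python
def tracking_order(num_frames, start_frame, direction="bidirectional"):
    """Return a list of passes, each a list of frame indices to visit.

    Hypothetical helper: every pass begins at the prompted frame so the
    tracker always starts from the frame that carries the user prompt.
    """
    if not 0 <= start_frame < num_frames:
        raise ValueError("start_frame must be inside the video")

    forward = list(range(start_frame, num_frames))   # start .. last frame
    backward = list(range(start_frame, -1, -1))      # start .. first frame

    if direction == "forward":
        return [forward]
    if direction == "backward":
        return [backward]
    if direction == "bidirectional":
        return [forward, backward]
    raise ValueError(f"unknown direction: {direction!r}")


passes = tracking_order(num_frames=5, start_frame=2)
```

With a 5-frame clip prompted on frame 2, the bidirectional case yields one pass over frames 2, 3, 4 and a second pass over frames 2, 1, 0.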

Installation

Option 1: ComfyUI-Manager (Recommended - Easiest)

1. Install ComfyUI-Manager (if you don't have it already):

  • Get it from: https://github.com/ltdrdata/ComfyUI-Manager

2. Download a model (see Model Download section below)

3. Install SeC Nodes:

  • Open ComfyUI Manager in ComfyUI
  • Search for "SeC" or "SecNodes"
  • Click Install
  • Click Restart when prompted

4. Done! The SeC nodes will appear in the "SeC" category

Option 2: Manual Installation

Step 1: Install Custom Node

cd ComfyUI/custom_nodes
git clone https://github.com/9nate-drake/Comfyui-SecNodes

Step 2: Install Dependencies

ComfyUI Portable (Windows):

cd ComfyUI/custom_nodes/Comfyui-SecNodes
../../../python_embeded/python.exe -m pip install -r requirements.txt

Standard Python Installation (Linux/Mac):

cd ComfyUI/custom_nodes/Comfyui-SecNodes
pip install -r requirements.txt

Step 3: Restart ComfyUI

The nodes will appear in the "SeC" category.

---

Model Download

Download ONE of the following model formats from https://huggingface.co/VeryAladeen/Sec-4B and place it in your ComfyUI/models/sams/ folder. The SeC Model Loader automatically detects the models in that folder and lets you select which one to use:

  • SeC-4B-fp16.safetensors (Recommended) - 7.35 GB
      • Best balance of quality and size
      • Works on all CUDA GPUs
      • Recommended for all systems
  • SeC-4B-bf16.safetensors (Alternative) - 7.35 GB
      • Alternative to FP16, better for some GPUs
  • SeC-4B-fp32.safetensors (Full Precision) - 14.14 GB
      • Maximum precision, highest VRAM usage
      • Better compatibility on some older GPUs

⚠️ FP8 Support Removed (v1.2)

  • FP8 quantization has been removed due to numerical instability issues
  • All users should use FP16 or BF16 models instead (same quality, fully reliable)
  • See [CHANGELOG.md](CHANGELOG.md) for full technical investigation
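Before launching ComfyUI, it can help to confirm which single-file models are actually present. The sketch below scans a models folder for the file names listed above and flags anything that looks like an FP8 file. `classify_models` is a hypothetical helper, and matching "fp8" in the file name is an assumption, since the README does not give the name of the removed FP8 file. This is not how the SeC Model Loader itself detects models.

```python
from pathlib import Path

# Precision suffixes of the single-file models listed in this README.
SUPPORTED = ("fp16", "bf16", "fp32")


def classify_models(models_dir):
    """Group SeC-4B-*.safetensors files by whether v1.2 still supports them."""
    found = {"supported": [], "unsupported_fp8": []}
    for path in sorted(Path(models_dir).glob("SeC-4B-*.safetensors")):
        # "SeC-4B-fp16.safetensors" -> stem "SeC-4B-fp16" -> suffix "fp16"
        suffix = path.stem.rsplit("-", 1)[-1].lower()
        if suffix in SUPPORTED:
            found["supported"].append(path.name)
        elif "fp8" in suffix:  # assumption: FP8 files carry "fp8" in the name
            found["unsupported_fp8"].append(path.name)
    return found
```

Calling `classify_models("ComfyUI/models/sams")` returns a dictionary with two lists. An empty "supported" list means no usable single-file model has been downloaded yet.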

Alternative: Original Sharded Model

For users who prefer the original OpenIXCLab format:

cd ComfyUI/models/sams

# Download using huggingface-cli (recommended)
huggingface-cli download OpenIXCLab/SeC-4B --local-dir SeC-4B

# Or using git with Git LFS installed (git lfs clone is deprecated)
git lfs install
git clone https://huggingface.co/OpenIXCLab/SeC-4B

Details:

  • Size: ~14.14 GB (sharded into 4 files)
  • Precision: FP32
  • Includes all config files in the download

Requirements

  • Python: 3.10-3.12 (3.12 recommended)
      • Python 3.13: Not recommended - experimental support with known dependency installation issues
  • PyTorch: 2.6.0+ (included with ComfyUI)
  • CUDA: 11.8+ for GPU acceleration
  • CUDA GPU: Recommended (CPU supported but significantly slower)
  • VRAM: See GPU VRAM recommendations below
      • Can reduce significantly by enabling offload_video_to_cpu (~3% speed penalty)
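The Python version requirement above can be checked from the interpreter that runs ComfyUI. This is a small sketch using only the standard library. `python_support_level` and its return labels are hypothetical, and "untested" is simply a label used here for versions the list above does not mention.

```python
import sys


def python_support_level(version=None):
    """Map a (major, minor) version to the support level stated above."""
    major, minor = (version or sys.version_info)[:2]
    if (major, minor) in {(3, 10), (3, 11), (3, 12)}:
        return "supported"
    if (major, minor) == (3, 13):
        return "experimental"
    return "untested"  # not covered by the requirements list


# Report the level for the interpreter running this script.
current_level = python_support_level()
```

On ComfyUI Portable, run the check with the embedded interpreter rather than a system-wide Python, since that is the interpreter the nodes will use.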

Note on CPU Mode:

  • CPU inference automatically uses float32 precision (bfloat16/float16 not supported on CPU)
  • Expect significantly slower performance compared to GPU…
