Hugging Face – Posts

Join the conversation

Join the community of Machine Learners and AI enthusiasts.

All HF Hub posts

YatharthS

posted an update 3 days ago

Post

3127

🤯 🤯 Released a high quality finetuned LLM based TTS model that can generate realistic and clear 48khz audio at over 100x realtime speed! 🤯 🤯

Github link: https://github.com/ysharma3501/MiraTTS

Model link: https://github.com/ysharma3501/MiraTTS

Blog explaining llm tts models: https://huggingface.co/blog/YatharthS/llm-tts-models

3 replies

branikita

posted an update 1 day ago

Post

1022

Next week, we will release full documentation for the SO ARM 101 with a parallel gripper, featuring leader and follower arms and support for widely used stereo cameras.

daavoo

posted an update 1 day ago

Post

624

2025: The Year of Agents.
2026: The Year of Local Agents?

Relying on cloud-hosted LLMs is often overkill. While frontier models still lead in complex coding, local models are now more than capable of handling many agentic workflows—with zero latency and total privacy.

To help bridge the gap between local inference and usable agents, I’m releasing agent.cpp: https://github.com/mozilla-ai/agent.cpp

It provides minimal, high-performance building blocks for agents in C++, built directly around the awesome llama.cpp ecosystem.
Stop sending your data to a remote API. Start building and running agents on your own hardware.

1 reply

danielhanchen

posted an update 2 days ago

Post

1607

Google releases FunctionGemma, a new 270M parameter model that runs on just 0.5 GB RAM.✨

Built for tool-calling, run locally on your phone at 50+ tokens/s, or fine-tune with Unsloth & deploy to your phone.

GGUF: unsloth/functiongemma-270m-it-GGUF
Docs + Notebook: https://docs.unsloth.ai/models/functiongemma

1 reply

victor

posted an update 3 days ago

Post

2416

Nvidia is on a roll lately. Nemotron 3 Nano is my new fav local model, but here's the real flex: they published the entire evaluation setup. Configs, prompts, logs, all of it. This is how you do open models 🔥

https://huggingface.co/blog/nvidia/nemotron-3-nano-evaluation-recipe

etemiz

posted an update about 24 hours ago

Post

965

looks like the best way to incorporate truth in AI is to use some kind of RAG.
what are the state of the art ways to consume knowledge graphs?
and what is the best way to build a knowledge graph using AI?

2 replies

Nymbo

posted an update 2 days ago

Post

1429

🚨 New tool for the Nymbo/Tools MCP server: The new Agent_Skills tool provides full support for Agent Skills (Claude Skills but open-source).

How it works: The tool exposes the standard discover/info/resources/validate actions. Skills live in /Skills under the same File_System root, and any bundled scripts run through Shell_Command, no new infrastructure required.

Agent_Skills(action="discover")  # List all available skills
Agent_Skills(action="info", skill_name="music-downloader")  # Full SKILL.md
Agent_Skills(action="resources", skill_name="music-downloader")  # Scripts, refs, assets

I've included a music-downloader skill as a working demo, it wraps yt-dlp for YouTube/SoundCloud audio extraction.

Caveat: On HF Spaces, Shell_Command works for most tasks, but some operations (like YouTube downloads) are restricted due to the container environment. For full functionality, run the server locally on your machine.

Try it out ~ https://www.nymbo.net/nymbot

prithivMLmods

posted an update 1 day ago

Post

717

Introducing TRELLIS.2 Text-to-3D. The demo for the TRELLIS.2-4B (Image-to-3D) model is streamlined with the Z-Image Turbo image generation model to enable Text-to-3D functionality. There is no need for input assets, making a small leap forward for ideation. Optionally, it also includes default support for Image-to-3D inference using direct image assets. Find the demo and related collections below... 🤗🔥

✨ TRELLIS.2-Text-to-3D [Demo]: prithivMLmods/TRELLIS.2-Text-to-3D
✨ Multimodal Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
✨ Github: https://github.com/PRITHIVSAKTHIUR/TRELLIS.2-Text-to-3D

To know more about it, visit the app page or the respective model page!

sergiopaniego

posted an update 2 days ago

Post

1577

Google DeepMind releases FunctionGemma, a 240M model specialized in 🔧 tool calling, built for fine-tuning

TRL has day-0 support. To celebrate, we’re sharing 2 new resources:

> Colab guide to fine-tune it for 🌐 browser control with BrowserGym OpenEnv
> Standalone training script

> Colab notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_functiongemma_browsergym_openenv.ipynb
> Training script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/browsergym_llm.py (command to run it inside the script)
> More notebooks in TRL: https://huggingface.co/docs/trl/example_overview#notebooks

codelion

posted an update about 15 hours ago

Post

607

Introducing PTS Visualizer - an interactive tool for exploring how language models reason!

Visualize pivotal tokens, thought anchors, and reasoning circuits. See which tokens and sentences significantly impact success probability, explore embedding clusters, and trace reasoning step-by-step.

Try it: codelion/pts-visualizer

Explore PTS datasets:
- Qwen3-0.6B: codelion/Qwen3-0.6B-pts
- DeepSeek-R1: codelion/DeepSeek-R1-Distill-Qwen-1.5B-pts

Or upload your own JSONL files!

GitHub: https://github.com/codelion/pts

Recently active users