Question about the output embedding vectors of ModernBERT
#12, opened by meaningful96
Are the output CLS and token embedding vectors L2 normalized on a per-token basis?
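To make the question concrete, here is a minimal sketch of what "L2 normalized on a per-token basis" would mean: every individual vector (CLS or token) has Euclidean norm 1. The helper names `l2_norms` and `is_l2_normalized` are hypothetical, and the sample vectors are made-up placeholders, not actual ModernBERT outputs.

```python
import math

def l2_norms(vectors):
    """Return the Euclidean (L2) norm of each vector in a list of vectors."""
    return [math.sqrt(sum(x * x for x in v)) for v in vectors]

def is_l2_normalized(vectors, tol=1e-3):
    """True if every vector has L2 norm within `tol` of 1.0."""
    return all(abs(n - 1.0) <= tol for n in l2_norms(vectors))

# Placeholder vectors for illustration only (not real model outputs).
raw = [[3.0, 4.0], [1.0, 2.0, 2.0]]            # norms 5 and 3
unit = [[0.6, 0.8], [1 / 3, 2 / 3, 2 / 3]]     # each rescaled to norm 1

print(l2_norms(raw))
print(is_l2_normalized(raw))
print(is_l2_normalized(unit))
```

Applying the same check to each row of the model's final hidden states (one row per token) would presumably answer the question empirically for a given input.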