Jopara Vibe

Real language, captured as it is spoken.

Guaraní and Jopará are complex, dynamic, and alive. We treat them as such.

Current focus: Guaraní and Jopará

Guaraní and Jopará are not standardized systems.

They are shaped by region, context, and daily use.

What is spoken is not always what is written.

What is written is often not how people actually speak.

Most digital systems are not built for this kind of language.

They assume consistency, structure, and standardization.

Our work focuses on capturing real speech — in its variation, its inconsistency, and its context.

Not as a simplified version. But as it actually exists.

Why this matters

Language does not disappear. But it can become invisible in the systems that shape everyday life.

Most digital environments are built around a limited set of languages.

When language is not represented, it is not recognized.

When it is not recognized, it is not supported.

This affects how people interact with technology, what they can access, and how their language is carried forward.

Language is not just communication. It is context, identity, and continuity.

System

A structured linguistic dataset forms the foundation. Controlled recording environments ensure acoustic quality. The app refines the dataset through real-world input over time.

Foundation

A structured linguistic dataset forms the foundation.

Relationships, context, and meaning are organized at scale — across phrases, domains, registers, speakers, and validation states.

ID	Guaraní	Spanisch_Gold	Contexto	Intención	Tono	Dominio	Registro_guarani	Validado_Silva	IPA_auto	Split	Schwierigkeit_CEFR
GDS-0002	Che aime porã.	Estoy bien.	respuesta a saludo	responder	informal	cotidiano	jopará	no	/ ʃe aime poɾã /	train	A2
GDS-0003	Aguyje.	Gracias.	interacción social	expresar	amable	cotidiano	guaraní_puro	no	/ aɣuɨdʒe /	test	A1
GDS-0004	Che aikotevẽ pytyvõ.	Necesito ayuda.	petición directa	expresar	directo	cotidiano	jopará	no	/ ʃe aikoteʋẽ pɨtɨʋõ /	train	B1
GDS-0005	Ko'ẽrõ aháta.	Mañana iré.	planificación	describir	neutro	cotidiano	jopará	no	/ koʔẽɾõ ahata /	train	A2
GDS-0006	Mba'épa nde réra?	¿Cómo te llamas?	presentación	preguntar	informal	cotidiano	jopará	no	/ mbaʔepa nde ɾeɾa /	train	B1
GDS-0007	Che réra Juan.	Me llamo Juan.	respuesta a identificación	responder	informal	cotidiano	jopará	no	/ ʃe ɾeɾa dʒuan /	train	A1
GDS-0008	Moõguipa nde reju?	¿De dónde venís?	presentación	preguntar	informal	cotidiano	jopará	no	/ moõɣuipa nde ɾedʒu /	train	B1
GDS-0009	Che aju Paraguay-gui.	Vengo de Paraguay.	respuesta a origen	responder	neutro	cotidiano	jopará	no	/ ʃe adʒu paɾaɣuaɨ-ɣui /	train	A2
GDS-0010	Jajoechapeve.	Nos vemos.	final de conversación	expresar	informal	cotidiano	guaraní_puro	no	/ dʒadʒoeʃapeʋe /	train	A1
GDS-0011	Che hasy.	Estoy enfermo/a.	salud	expresar	neutro	médico	jopará	no	/ ʃe hasɨ /	train	A1
GDS-0012	Eju ko'ápe.	Ven aquí.	instrucción directa	ordenar	directo	cotidiano	guaraní_puro	no	/ edʒu koʔape /	train	A2
GDS-0013	Ani rejapo upéva.	No hagas eso.	advertencia	ordenar	directo	cotidiano	guaraní_puro	no	/ ani ɾedʒapo upeʋa /	train	A2
GDS-0014	Epyta michĩmi.	Esperá un momento.	espera	ordenar	neutro	cotidiano	guaraní_puro	no	/ epɨta miʃĩmi /	train	A2
GDS-0015	Ehendu.	Escuchá.	comunicación	ordenar	directo	cotidiano	guaraní_puro	no	/ ehendu /	dev	A1
GDS-0016	Che avy'a.	Estoy feliz.	expresión personal	expresar	informal	cotidiano	guaraní_puro	no	/ ʃe aʋɨʔa /	train	A2
GDS-0017	Che ndavy'ái.	No estoy feliz.	expresión personal	expresar	informal	cotidiano	guaraní_puro	no	/ ʃe ndaʋɨʔai /	train	A2
GDS-0018	Che kane'õ.	Estoy cansado/a.	estado físico	expresar	neutro	cotidiano	guaraní_puro	no	/ ʃe kaneʔõ /	train	B1
GDS-0019	Che karu.	Estoy comiendo.	actividad cotidiana	describir	neutro	cotidiano	guaraní_puro	no	/ ʃe kaɾu /	train	A1
GDS-0020	Che a'u.	Estoy bebiendo.	actividad cotidiana	describir	neutro	cotidiano	jopará	no	/ ʃe aʔu /	train	A2
GDS-0021	Che ake.	Estoy durmiendo.	estado físico	describir	neutro	cotidiano	guaraní_puro	no	/ ʃe ake /	train	A1
GDS-0022	Che apu'ã.	Me levanto.	rutina diaria	describir	neutro	cotidiano	guaraní_puro	no	/ ʃe apuʔã /	train	B1
GDS-0023	Che aha.	Me voy.	despedida	describir	neutro	cotidiano	guaraní_puro	no	/ ʃe aha /	dev	A1
GDS-0024	Eju hag̃ua.	Vení, por favor.	invitación	ordenar	amable	cotidiano	guaraní_puro	no	/ edʒu haŋua /	train	A1
GDS-0025	Eguapy.	Sentate.	instrucción directa	ordenar	directo	cotidiano	guaraní_puro	no	/ eɣuapɨ /	train	A1
GDS-0026	Epu'ã.	Levantate.	instrucción directa	ordenar	directo	cotidiano	guaraní_puro	no	/ epuʔã /	train	A2
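A minimal sketch of how a sheet like the one above could be read and summarized, assuming it is stored as tab-separated text with exactly these column names. The helper names `load_rows` and `count_by` are illustrative and not part of the actual system.

```python
import csv
import io
from collections import Counter

# Two rows copied from the sample above. A real sheet would be read from a file.
SAMPLE_TSV = (
    "ID\tGuaraní\tSpanisch_Gold\tContexto\tIntención\tTono\tDominio\t"
    "Registro_guarani\tValidado_Silva\tIPA_auto\tSplit\tSchwierigkeit_CEFR\n"
    "GDS-0003\tAguyje.\tGracias.\tinteracción social\texpresar\tamable\tcotidiano\t"
    "guaraní_puro\tno\t/ aɣuɨdʒe /\ttest\tA1\n"
    "GDS-0011\tChe hasy.\tEstoy enfermo/a.\tsalud\texpresar\tneutro\tmédico\t"
    "jopará\tno\t/ ʃe hasɨ /\ttrain\tA1\n"
)


def load_rows(text):
    """Parse tab-separated text into one dict per entry, keyed by column name."""
    # QUOTE_NONE keeps apostrophes and quotes in phrases as literal characters.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def count_by(rows, column):
    """Count how many entries share each value of the given column."""
    return Counter(row[column] for row in rows)


rows = load_rows(SAMPLE_TSV)
print(count_by(rows, "Split"))
print(count_by(rows, "Registro_guarani"))
```

The same `count_by` call works for any column, for example `Dominio` or `Schwierigkeit_CEFR`.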

Structured export layer: train.jsonl

{
  "id": "GDS-0011",
  "source_lang": "gn",
  "target_lang": "es",
  "source": "Che hasy.",
  "target": "Estoy enfermo/a.",
  "ipa": "/ ʃe hasɨ /",
  "domain": "médico",
  "register": "jopará",
  "cefr": "A1",
  "validation": "reviewed",
  "split": "train",
  "export_ready": true
}
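As a rough illustration, a sheet row could be turned into an export record like the one above as follows. The column-to-key mapping is inferred from the two samples and is an assumption, as are the function names. The `validation` and `export_ready` fields are left out because the rule that sets them is not described here.

```python
import json

# Assumed mapping from sheet column names to export keys, inferred from the samples.
COLUMN_TO_KEY = {
    "Guaraní": "source",
    "Spanisch_Gold": "target",
    "IPA_auto": "ipa",
    "Dominio": "domain",
    "Registro_guarani": "register",
    "Schwierigkeit_CEFR": "cefr",
    "Split": "split",
}


def to_export_record(row, source_lang="gn", target_lang="es"):
    """Build an export record from one sheet row (a dict keyed by column name)."""
    record = {"id": row["ID"], "source_lang": source_lang, "target_lang": target_lang}
    for column, key in COLUMN_TO_KEY.items():
        record[key] = row[column]
    return record


def to_jsonl_line(record):
    """Serialize a record as one JSON Lines row: a single line, non-ASCII kept as-is."""
    return json.dumps(record, ensure_ascii=False)


sample_row = {
    "ID": "GDS-0011",
    "Guaraní": "Che hasy.",
    "Spanisch_Gold": "Estoy enfermo/a.",
    "IPA_auto": "/ ʃe hasɨ /",
    "Dominio": "médico",
    "Registro_guarani": "jopará",
    "Schwierigkeit_CEFR": "A1",
    "Split": "train",
}

print(to_jsonl_line(to_export_record(sample_row)))
```

In a .jsonl file each record occupies exactly one line, so the indented form shown above is for readability only.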

Dataset surface: 12 sheets

Per entry: 59 fields

Coverage: 7 domains

Annotation layers: IPA · CEFR · Split

Controlled

Controlled recording environments ensure consistency and acoustic quality.

Studio conditions provide clarity, control, and reliable source material — the quality layer the rest of the system rests on.

Real-world layer

The app refines the dataset through real-world input.

Everyday speech, structured contributions, and audio capture extend the dataset over time — the app is the refinement layer, not the foundation.

Image: Jopara Vibe feed showing a Guaraní phrase with cultural context.

Captured

Real speech, with its meaning intact.

Each phrase arrives with translation and cultural context — preserved in the form it is actually used, not a simplified or standardized version.

Contributed

Structured contribution, not open submission.

Each entry is shaped by fields: phrase, translation, context, category. The structure is what turns everyday language into material the dataset can absorb.
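A minimal sketch of what such a structured contribution might look like in code, assuming the four fields named above are all required. The class name `Contribution` and the helper `missing_fields` are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Contribution:
    """One contributed phrase with the four fields named in the text."""
    phrase: str
    translation: str
    context: str
    category: str


def missing_fields(contribution):
    """Return the names of fields that are empty or contain only whitespace."""
    return [
        f.name
        for f in fields(contribution)
        if not getattr(contribution, f.name).strip()
    ]


complete = Contribution("Aguyje.", "Gracias.", "interacción social", "cotidiano")
incomplete = Contribution("Aguyje.", "", "  ", "cotidiano")

print(missing_fields(complete))
print(missing_fields(incomplete))
```

A contribution with any missing field would be returned to the contributor rather than passed on.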

Image: Submission interface for contributing a Guaraní phrase with translation, context, and category.

Image: The values system of Mbarete, Pytyvõ, and Mbojerovia (strength, mutual help, and trust), drawn from Guaraní.

Framed

Built around principles from the language itself.

Mbarete, Pytyvõ, Mbojerovia — strength, mutual help, trust. The system is shaped by values drawn from the culture it serves, not imposed on it from outside.

Extended

Audio capture extends the dataset with acoustic material.

Selected contributors record guided phrase sets. These feed the speech layer that underpins recognition and synthesis over time.

Image: Audio capture interface with a guided set of phrases to record.

No input moves directly into training.

Every entry passes through a structured validation pipeline before it becomes part of the dataset.
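The gating idea can be sketched as a list of checks that every entry must pass before it is accepted. The two checks and the `reviewed` state used here are assumptions for illustration. The actual pipeline stages are not described in this document.

```python
def has_required_text(entry):
    """Check that both the phrase and its translation are non-empty."""
    return bool(entry.get("source", "").strip()) and bool(entry.get("target", "").strip())


def is_reviewed(entry):
    """Check that the entry carries the (assumed) 'reviewed' validation state."""
    return entry.get("validation") == "reviewed"


# Hypothetical pipeline: an entry must pass every check, in order.
PIPELINE = (has_required_text, is_reviewed)


def run_pipeline(entries, checks=PIPELINE):
    """Split entries into accepted ones and held ones with the names of failed checks."""
    accepted, held = [], []
    for entry in entries:
        failed = [check.__name__ for check in checks if not check(entry)]
        if failed:
            held.append((entry, failed))
        else:
            accepted.append(entry)
    return accepted, held


entries = [
    {"id": "GDS-0011", "source": "Che hasy.", "target": "Estoy enfermo/a.", "validation": "reviewed"},
    {"id": "GDS-0003", "source": "Aguyje.", "target": "Gracias.", "validation": "pending"},
]

accepted, held = run_pipeline(entries)
print([entry["id"] for entry in accepted])
print([(entry["id"], failed) for entry, failed in held])
```

Held entries keep the names of the checks they failed, so a reviewer can see why an entry has not yet reached the dataset.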

Access

Access to the current system is controlled.

Participation is limited to selected contributors and structured programs.

This ensures consistency, data quality, and a reliable foundation for the systems being developed.

The system is not designed for open, unsupervised input.

It is built through guided collection and controlled environments.

This is where it starts.

Contact

For partnerships and inquiries:

contact@joparavibe.com