Compact Open-Source MiniGPT — Made by Ahmed Walid (AXV)
Jowa-mAi is a lightweight, open-source MiniGPT-style language model built with PyTorch. It uses an LSTM-based architecture trained on character-level Q&A data fetched from a remote database, and is designed to run on low-end hardware with no GPU required.
2-layer LSTM with 512 hidden units and 256-dim embeddings — lightweight yet capable
Fetches Q&A pairs live from GitHub and trains on them using cross-entropy loss + Adam optimizer
Training auto-stops when loss stops improving — saves the best weights automatically
Runs on Windows and Linux — CPU or GPU via CUDA auto-detection
JowaMAI is a PyTorch nn.Module with an Embedding layer → 2-layer LSTM → Dropout → Linear output layer. It takes a sequence of character indices and predicts the next character at each step.
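The architecture described above can be sketched as follows. This is a minimal reconstruction, not the project's exact code; the hyperparameter defaults (256-dim embeddings, 512 hidden units, 2 layers) come from the feature list, while the dropout rate is an assumption.

```python
import torch
import torch.nn as nn

class JowaMAI(nn.Module):
    """Sketch of the described stack: Embedding -> 2-layer LSTM -> Dropout -> Linear."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512,
                 num_layers=2, dropout=0.2):  # dropout value is an assumption
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            num_layers=num_layers, batch_first=True)
        self.drop = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, hidden=None):
        # x: (batch, seq_len) tensor of character indices
        emb = self.embed(x)                   # (batch, seq_len, embed_dim)
        out, hidden = self.lstm(emb, hidden)  # (batch, seq_len, hidden_dim)
        logits = self.fc(self.drop(out))      # (batch, seq_len, vocab_size)
        return logits, hidden
```

Returning `hidden` alongside the logits lets generation feed the LSTM state forward one character at a time instead of re-running the whole prompt each step.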
No subword tokenizer — the model works at the character level. data_tokens is a list of every supported character (a–z, A–Z, 0–9, symbols, space, newline). Each character maps to its list index.
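A character-level vocabulary like this is just a list plus its inverse lookup. The sketch below is a hypothetical reconstruction of `data_tokens`; the exact character set and ordering in the project may differ.

```python
import string

# Hypothetical reconstruction of data_tokens: every supported character,
# where each character's vocabulary id is simply its list index.
data_tokens = list(string.ascii_lowercase + string.ascii_uppercase
                   + string.digits + string.punctuation + " \n")
char_to_idx = {ch: i for i, ch in enumerate(data_tokens)}

def encode(text):
    """Map a string to its list of character indices."""
    return [char_to_idx[ch] for ch in text]

def decode(indices):
    """Map character indices back to a string."""
    return "".join(data_tokens[i] for i in indices)
```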
fetch_remote_data() pulls a JSON file from GitHub. The JSON is a flat dictionary of "question": "answer" pairs used as the training corpus.
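A minimal sketch of that fetch, assuming a raw-GitHub JSON endpoint; the URL below is a placeholder, not the project's real data path.

```python
import json
from urllib.request import urlopen

# Placeholder -- the real raw-GitHub path is project-specific.
DATA_URL = "https://raw.githubusercontent.com/<user>/<repo>/main/data.json"

def parse_corpus(raw_bytes):
    """Decode the JSON payload into a flat question -> answer dict."""
    return json.loads(raw_bytes.decode("utf-8"))

def fetch_remote_data(url=DATA_URL):
    """Download the training corpus: a flat {"question": "answer"} dictionary."""
    with urlopen(url) as resp:
        return parse_corpus(resp.read())
```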
Each Q&A pair is formatted as "Q: ...\nA: ...\n" then encoded. The model learns to predict the next character given all previous ones. Uses ReduceLROnPlateau to decay the learning rate and early stopping (patience=10) to avoid overfitting.
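The training setup above can be sketched like this. Function names, the LR-scheduler settings, and the checkpoint path are assumptions; only cross-entropy + Adam, ReduceLROnPlateau, and patience=10 early stopping come from the description.

```python
import torch
import torch.nn as nn

def build_corpus(qa_pairs):
    """Format each Q&A pair as 'Q: ...\nA: ...\n', per the training setup."""
    return "".join(f"Q: {q}\nA: {a}\n" for q, a in qa_pairs.items())

def train(model, batches, epochs=100, patience=10, lr=1e-3,
          path="best_model.pt"):  # path and most defaults are assumptions
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    # Decay the learning rate when validation-style loss plateaus.
    sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.5, patience=3)
    loss_fn = nn.CrossEntropyLoss()
    best, stale = float("inf"), 0
    for _ in range(epochs):
        total = 0.0
        for x, y in batches:  # y is x shifted one character ahead
            opt.zero_grad()
            logits, _ = model(x)
            loss = loss_fn(logits.reshape(-1, logits.size(-1)), y.reshape(-1))
            loss.backward()
            opt.step()
            total += loss.item()
        avg = total / len(batches)
        sched.step(avg)
        if avg < best - 1e-4:
            best, stale = avg, 0
            torch.save(model.state_dict(), path)  # keep the best weights
        else:
            stale += 1
            if stale >= patience:  # early stopping: loss stopped improving
                break
    return best
```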
Given a user prompt, the model formats it as "Q: {prompt}\nA: ", then generates characters one by one using temperature-scaled softmax sampling until it hits a sentence-ending punctuation or max length.