Overview
The current generative AI boom, led by large language models (LLMs), was made possible by training on vast amounts of text data from the web. For organisms and the natural world, however, no dataset of sufficient quality and quantity yet exists. IKIMON aims to do more than run an app: its goal is to build the "world's highest quality organism image dataset" and create a Japan-originated foundation model specialized in biodiversity, the Large Nature Model (LNM). This is essential infrastructure for a world where the children of the future can carry around a "pocket expert" with just a smartphone.
Current Challenges: Why is Current AI Weak on Organisms?
1. The "Long Tail" Wall (Data Bias)
Even in the iNaturalist dataset, the world's largest, there are tens of thousands of images of "common species" such as sparrows and cabbage white butterflies, but only a few images, or none at all, of endangered species and obscure insects. AI performs well where data is plentiful and cannot learn from what is scarce. The result is that the rarer a species is, the more likely it is to be ignored or misidentified, which is exactly the situation conservation most needs to avoid.
2. The "Misidentification Reproduction" Cycle
Many apps have introduced "AI automatic suggestions," but beginners frequently register the suggestion as-is, thinking "if the AI says so, it must be right." This sets off a cycle:
- Incorrect data is registered as "correct" in the database
- The next AI learns from that incorrect data
- Misidentification is reinforced, and nobody can notice the mistakes
3. Lack of Context
Many current image recognition AIs are trained on "clean photos of adults." However, what people actually capture during nature observation is often:
- "Out of focus," "rear view," "partial only"
- "Larvae," "eggs," "molted shells," "droppings," "footprints"
4. Spatial Bias and the "Luxury Effect"
Citizen science data has a strong bias called the "Luxury Effect": the wealthier the urban area, the more data is collected.
- Reason: Wealthy areas have more green space (and so more organisms), and residents have the "time and mental margin" to observe with smartphones.
- Problem: AI training data becomes biased toward "creatures in urban parks," and data from original habitats in "rural mountainous areas (depopulated areas)" is not learned, making the AI a "city kid."
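The imbalance described in challenges 1 and 4 can be made concrete with a small calculation. The sketch below is only an illustration, not something the source describes IKIMON doing: it counts observations per species in a made-up sample and derives inverse-frequency weights, one common way to keep rare classes from being drowned out during training. The species counts and the `inverse_frequency_weights` helper are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / (class count).

    Weights are scaled so every class contributes the same total weight:
    count * weight == len(labels) / number_of_classes.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up sample (not real data): two common species and one rare one.
labels = ["sparrow"] * 900 + ["cabbage white"] * 95 + ["rare beetle"] * 5

weights = inverse_frequency_weights(labels)
for species, weight in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{species}: {weight:.2f}")
```

The same counting could be done per region instead of per species to expose the urban skew. Reweighting only rebalances what is already in the dataset; it cannot supply information about species or habitats for which no images exist.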
IKIMON's Strategy: Japan as a "Data Sanctuary"
1. Leveraging Japan's "Photography Power"
Japan has many "amateur photographers and naturalists" working at an extremely high level by global standards. Their photos are as high-definition as academic specimens, and artistic as well. By leveraging "Japan's photography skills," IKIMON aims to build a low-noise, "museum-grade dataset." The approach emphasizes quality, not just quantity.
2. Validation First
AI suggestions are kept strictly as "assistance," and the workflow requires that "trustworthy human eyes (experts, experienced users)" always be involved in confirming data. Attaching a trust score based on "who identified it" to each record makes it possible to filter at training time, for example "use only reliability-A images," which keeps the model from learning from misidentified data.
3. "Life Cycle" Dataset
We create AI that understands not just the "species name" but also the "state" of the organism.
- Multi-stage Learning: Label and train on each stage, from egg → larva → pupa → adult.
- Field Sign Learning: Droppings, feeding marks, and burrows, that is, traces "other than the organism itself," also become training targets.
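The trust-score filtering described under Validation First and the stage and field-sign labels listed here could be combined in a single record schema. The following is a minimal sketch under stated assumptions, not IKIMON's actual data model: the `Observation` fields, the `Evidence` categories, the "A" to "C" grade scale, and the `select_training_set` helper are all hypothetical names chosen for illustration. Only the idea of a "reliability-A" filter comes from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LifeStage(Enum):
    EGG = "egg"
    LARVA = "larva"
    PUPA = "pupa"
    ADULT = "adult"


class Evidence(Enum):
    ORGANISM = "organism"  # the organism itself is in the photo
    DROPPINGS = "droppings"
    FEEDING_MARK = "feeding_mark"
    BURROW = "burrow"


@dataclass(frozen=True)
class Observation:
    image_path: str
    species: str
    evidence: Evidence
    life_stage: Optional[LifeStage]  # None when the stage is unknown or does not apply
    reliability: str  # assumed grade scale: "A" (highest) to "C"


def select_training_set(observations, allowed_grades=frozenset({"A"})):
    """Keep only observations whose reliability grade is in allowed_grades."""
    return [obs for obs in observations if obs.reliability in allowed_grades]


# Made-up example records (paths and grades are illustrative only).
observations = [
    Observation("img/001.jpg", "Papilio maackii", Evidence.ORGANISM, LifeStage.LARVA, "A"),
    Observation("img/002.jpg", "Papilio maackii", Evidence.ORGANISM, LifeStage.ADULT, "C"),
    Observation("img/003.jpg", "Lepus brachyurus", Evidence.DROPPINGS, None, "A"),
]

training_set = select_training_set(observations)
print([obs.image_path for obs in training_set])
```

Keeping life stage and evidence type as separate fields lets a field sign such as droppings be labeled without forcing a life stage onto it, while the reliability grade stays independent of both so the training filter can be tightened or relaxed without relabeling.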
Future Vision: Building the Large Nature Model (LNM)
What IKIMON aims for is a "multimodal foundation model of the natural world" combining images and language.
- Input: A photo of a "mystery larva" taken with a smartphone, plus GPS information (location, time).
- AI's Thinking
- Output: "This is a Papilio maackii larva. It will soon become a chrysalis. Is there an Amur Cork Tree nearby?"
The World IKIMON Aims For
"I'm not an expert, so I don't understand nature."
We want to break that wall with technology.
Photograph a bug found during a walk. AI tells you "maybe this." An expert says "that's right." Data gathered this way becomes valuable scientific data to protect Japan's nature.
A society where everyone can be a "discoverer." That is the future IKIMON wants to create.