Neat. Any reason why the MCP server doesn't expose a JavaScript/eval tool? Current models excel at writing JS to drive and inspect the DOM. They aren't great at driving browsers via screenshots.
FWIW, if you have Claude Code or the like, you can quickly prompt your way to an eval function in MCP. It already exists in clicker and the client API. You can use it to get the accessibility tree, for example, and use that to find what to fill out and click.
> why the MCP server doesn't expose a JavaScript/eval tool?
no reason other than my #1 goal was "ship something". i only started the actual coding on dec 11. it's been a bit of a sprint the last two weeks!
though "image-based" vs "dom-based" testing approaches is a very big topic! (look forward to researching that more in the future.)
Create a single markdown file. For each SKILL.md of the skills you want to use, put its frontmatter in that file along with the full path to the SKILL.md file. On session start, tell Gemini to read that file. If you reference it in your AGENTS.md, you don't have to instruct Gemini. And if you have your skills in a known folder, let Gemini write a small script that generates that markdown file for you.
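For illustration, here is a minimal sketch of what such a generator script might look like. It assumes each SKILL.md starts with a frontmatter block fenced by `---` lines and that all skills live under one folder; the function names and the `# Available skills` heading are made up for this example, not part of any tool.

```python
from pathlib import Path


def read_frontmatter(skill_file: Path) -> str:
    """Return the frontmatter of a SKILL.md (without the --- fences), or "" if none."""
    lines = skill_file.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i])
    return ""  # opening fence without a closing one: treat as no frontmatter


def build_index(skills_dir: Path) -> str:
    """Build one markdown document listing every SKILL.md found under skills_dir."""
    sections = ["# Available skills", ""]
    for skill_file in sorted(skills_dir.rglob("SKILL.md")):
        frontmatter = read_frontmatter(skill_file)
        if not frontmatter:
            continue  # skip files that have nothing to summarize
        # Heading is the full path, so the agent knows which file to open later.
        sections.append(f"## {skill_file.resolve()}")
        sections.append("")
        sections.append(frontmatter)
        sections.append("")
    return "\n".join(sections)
```

You would then write the result of `build_index(Path("path/to/your/skills"))` to a file (the path here is a placeholder) and point your AGENTS.md at it.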
Oh, I didn't intend this to come across as MCP being useless. I've written this from the perspective of someone who uses LLMs mostly for coding/computer tasks, where I found MCP to be less than ideal for my use cases.
I actually think MCP can be a multiplier for non-technical users, were it not for some nits like being a bit too technical and the various security footguns many MCP servers hand you.
Also not disagreeing with your argument. Just want to point out that you can achieve the same by putting minimal info about your CLI tools in your global or project-specific CLAUDE.md.
The only downside here is that it's more work than `claude mcp add x -- npx x@latest`. But you get composability in return, as well as the intermediate tool outputs not having to pass through the model's context.
I run a few production RAG systems, some dating back to the end of 2023, and arrived at the same conclusions.
Query expansion and non-naive chunking give the biggest bang for the buck, with chunking being the most resource-intensive task if the input data is chunk (pun intended).
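To make "non-naive chunking" a bit more concrete, here is one simple reading of it: pack whole paragraphs into chunks instead of cutting at fixed character offsets. This is only a sketch under that assumption, not what I or anyone else necessarily runs in production; the function name and the `max_chars` default are made up for illustration.

```python
def chunk_by_paragraph(text: str, max_chars: int = 1000) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_chars characters."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate  # paragraph still fits into the open chunk
            continue
        if current:
            chunks.append(current)  # close the open chunk before starting a new one
        # Fallback: a single oversized paragraph is cut at hard character boundaries.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks
```

Real data usually needs more than this (headings, tables, code blocks, overlap between chunks), which is where most of the effort goes when the input is messy.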
I love this! Not just because I also grew up in the 90s and like your music choice :)
As we drown in media and slop, I think it's super important to teach kids how to be selective and develop taste. And I too found that physical connection does help with that.
Great project and execution. It would be great if you could also introduce a social aspect, i.e. kids sharing/swapping cards.
Chrome dev console has Gemini integrated as well. Otherwise pick any coding agent (Claude Code, Codex, opencode, ...), give it the Playwright MCP, and ask away.