I gave GPT-4 eyes. Here’s what I did:
- added some data to a vision model
- gave the AI camera access
- asked it questions about the scene
- it identified objects
- it searched the web for info
- it used that info to answer accurately

Watch it get 3 questions 100% correct!
And just to clarify: the vision stuff isn’t GPT-4’s work. It can’t access your camera. I hooked up a *separate* vision model that handles the camera feed. Paired the two to make the idea work.
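A minimal sketch of how a pairing like this might work (hypothetical stand-ins only — `detect_objects` and `ask_llm` are placeholders, not the actual models or code used in the demo):

```python
# Hypothetical sketch: a separate vision model detects objects in a camera
# frame, and its text output is injected into the LLM prompt. detect_objects
# and ask_llm are stand-ins for real model/API calls.

def detect_objects(frame):
    """Stand-in for a vision model (e.g. an object detector)."""
    # A real implementation would run the frame through a detection model
    # and return labels with confidence scores.
    return [("laptop", 0.97), ("coffee mug", 0.91)]

def build_prompt(detections, question):
    """Turn detector output into plain text the LLM can reason over."""
    scene = ", ".join(f"{label} ({conf:.0%})" for label, conf in detections)
    return f"Objects visible in the camera frame: {scene}.\nQuestion: {question}"

def ask_llm(prompt):
    """Stand-in for a GPT-4 API call."""
    return f"(model answer based on: {prompt!r})"

detections = detect_objects(frame=None)  # frame would come from the camera
prompt = build_prompt(detections, "What is on the desk?")
print(ask_llm(prompt))
```

The key point is that GPT-4 never sees pixels here — it only sees the detector's labels as text, which matches the clarification above.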
@mckaywrigley This is hyperbole marketing. Why not stay focused on your startup?
@mckaywrigley Which vision models are you using?
@mckaywrigley Can you archive all this data with timestamps, then every 15 seconds have GPT look at it and summarize what happened in the scene during those 15 seconds? The summary can then go into its "memory". Repeat. Then after 1 minute you ask: "what happened in the last minute?"
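The rolling-memory scheme proposed above could be sketched roughly like this (a hypothetical sketch — `summarize` stands in for a GPT call, and the data structures are my own assumptions):

```python
# Hypothetical sketch of the proposed rolling memory: buffer timestamped
# scene descriptions, summarize every 15 seconds, and answer questions
# over the accumulated summaries. summarize() stands in for a GPT call.
from collections import deque

WINDOW = 15  # seconds per summary chunk

def summarize(events):
    """Stand-in for asking GPT to summarize a 15-second chunk."""
    return "; ".join(desc for _, desc in events)

class SceneMemory:
    def __init__(self):
        self.buffer = []          # (timestamp, description) pairs not yet summarized
        self.summaries = deque()  # (chunk_start_time, summary) pairs

    def record(self, timestamp, description):
        self.buffer.append((timestamp, description))
        # Once the buffer spans a full window, summarize it and start fresh.
        if timestamp - self.buffer[0][0] >= WINDOW:
            self.summaries.append((self.buffer[0][0], summarize(self.buffer)))
            self.buffer = []

    def last_minute(self, now):
        """Answer 'what happened in the last minute?' from stored summaries."""
        recent = [s for start, s in self.summaries if now - start <= 60]
        return " | ".join(recent)

mem = SceneMemory()
for t, desc in [(0, "person enters"), (15, "person sits"), (30, "cat appears")]:
    mem.record(t, desc)
print(mem.last_minute(now=31))  # prints the summary of the first 15s chunk
```

Old summaries could themselves be re-summarized into coarser chunks to keep the "memory" from growing without bound.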
@mckaywrigley Be curious what the prompt looked like — is it all the detected objects?
@mckaywrigley Stunning! What an adventure this AI is. Have you shared any of your cobbling methods, IDE, Git repo, or languages? I'm very curious indeed. I'm using VSCode, GitHub, and learning the ways of Python. Thinking I'd like to start tinkering with it all myself. Leg up?