Was I in a "STEM PhD" Human Data Farm?

Over the holidays I found myself in what I believe was a "STEM PhD" Farm that was harvesting human reasoning data and conducting human subjects research.

Recently, I had an experience that I want to share because it exemplifies a need I hope to address for the science and engineering community. I responded to a recruiter's message. The offer was interesting and unusual, and read as follows:

“Hey Megan, I'm reaching out because we have an exciting, high priority project with a leading AI lab for which I think you'll be a great fit because of your expertise as a domain expert in Astrophysics. About the Role: You will be working on a special project with our client, one of the leading AI Labs. As part of the project, you will be working alongside some of the leading researchers in AI. The role is fully remote and will be asynchronous – you will be able to work on your own time and on top of your current main job. This is a part-time role, with the expected hours being 10-20 hours per week. Pay varies depending on your background and experience. We are looking for people who might be ready to start as soon as this Wednesday, so if you're interested, please apply here and your application would be expedited given that I reached out to you personally. We look forward to having you on the project :) Thanks, Operations Manager”

Author redacted for privacy

I was intrigued, so I decided to engage and see what it was all about. I went through the application process, which involved an AI interview.

The onboarding

Twenty-four hours later, I received an email with the subject line “Job Offer: Welcome to [leading AI lab]”. It included a calendar invite, titled “Astrophysics/Organic Chemistry Onboarding”, with about 40 other people on it. I attended the event, where a leader representing the AI lab and a leader from the recruiting company explained that we would be going over logistics and getting people set up with Slack, Tailscale, Insightful, Gmail addresses that combined our names with the leading AI lab’s domain, and more.

The AI lab leader mentioned that he was passionate about “reasoning”. We were told we would learn more about the project in the final 15 minutes of the call. At a slightly awkward moment, the recruiting company leader spoke up to playfully explain that she takes her dog to work, so if we saw something popping in and out of her camera view, it was her dog. No one responded, even though when people show their dog on camera others usually laugh, send a heart emoji, or say that the dog is cute, almost as a ritual.

A disguised “human reasoning” data collection farm?

During the meeting I asked the leaders whether the work we would be doing was meant to validate the AI models, or whether they were conducting “human subjects research” and collecting data about how we “reason”. The AI lab leader laughed, didn’t say yes or no, and dismissed the question with a couple of jokes. After they showed us the work we would be doing, people started to ask questions such as “what if someone changes the database to replace my email address with their email address and takes credit for my work?” Another person said, “we were told there would be training and mentoring with leading AI researchers, is that still true?”

These questions were largely dismissed. We were told that they expected people not to tamper with the Excel spreadsheet in that way, and that they could monitor the edit history. They also said that the information about mentorship and training was out of date, and asked the person which document it had appeared in. With little information about the project and only very brief instructions about what we would be doing, our group, which the leaders were now referring to as “STEM PhDs”, was told that we had everything we needed and should be able to get started.

The illusion starts fading...

Someone then asked whether we would be paid for the time spent in the meeting. The recruiting company employee said that, now that she knew we were going to continue working on this after learning more about it, she would credit the accounts of the people who were staying on for the hour spent in the meeting. This clued me in that they expected only a fraction of the “STEM PhDs” to make it to the end of this funnel.

At this point, the meeting had gone on for over an hour, some people had left, and many were bystanders with their cameras off. We were then told that we must work a minimum of 20 hours a week, and that our screens and activities would be monitored while we were using the tracking app to log our time. We were also told that at the beginning, while we were just getting started, it was OK to take a certain amount of time to work through the problems.

However, we were told that we would soon be required to keep a certain pace and complete an expected amount of work in the allotted 20 hours a week. At this point, episodes 8 through 10 of the Disney+ show “Star Wars: Andor”, the ones about the Imperial prison complex, flashed through my mind [1]. I chuckled, realized that I did not have to be there and that it was my choice, and remembered that I was there to learn about what was going on.

Given the nature of the tasks we were asked to complete, I believe we may have been in a farm for human subjects research, one that collects data from “STEM PhDs” and other experts so that this leading AI lab can train its generative AI tool to answer questions at the level of a PhD or expert. The department at this leading AI lab conducting this work is called “Human Data”, so perhaps I should not have been so shocked. Still, based on the messages, the interactions, the nature of the interview questions, and everything else I had experienced leading up to it, I was not prepared for what I was getting into until I was already in it. Some communications even said that after the 1-2 month commitment (over the holidays) there would be a chance to join the leading AI lab’s team full time and work with the top AI researchers.

A perspective on human data collection

Perhaps this is the right way to advance generative AI technology, and for the right amount of money, there are people who are willing to provide their human data to this process. What is somewhat frightening, however, is that people have been lured into providing their body’s data without pausing to consider, or being aware of, how this impacts them. Elon Musk recently said that the human data available for AI training has been exhausted [2].

But what if we do this differently? What if scientists and engineers could receive something else and something more for providing their body’s reasoning skills, cultivated over their lifetimes, to be used over and over collectively, by the AI, in service of others?

As one example, what if they could be credited similarly to an artist or musician, with rights over how their unique body’s reasoning skills contribute to what generative AI creates for someone? What would that require us to know about these models? Do we need to know how much a particular part of a model’s training data contributed to a particular served response? Could we then allocate cost and payment credit to the people whose reasoning skills contributed to that response? After all, if the AI learns to copy how an individual or group “reasons”, it is capturing how that scientist, engineer, doctor, or lawyer is, in a way, creating.
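
To make the allocation question concrete, here is a minimal sketch in Python. It assumes, hypothetically, that some attribution method already yields a non-negative contribution score for each contributor to a given response. I am not claiming such scores can be computed for current models. The function name `allocate_credit`, the contributor labels, the scores, and the payment amount are all illustrative assumptions.

```python
from typing import Dict


def allocate_credit(contributions: Dict[str, float], payment: float) -> Dict[str, float]:
    """Split `payment` across contributors in proportion to their scores."""
    if any(score < 0 for score in contributions.values()):
        raise ValueError("contribution scores must be non-negative")
    total = sum(contributions.values())
    if total <= 0:
        # No measurable contribution: nobody is credited.
        return {name: 0.0 for name in contributions}
    return {name: payment * score / total for name, score in contributions.items()}


# Hypothetical attribution scores for one served response (assumed, not measured).
scores = {"astrophysicist": 0.5, "organic_chemist": 0.3, "engineer": 0.2}

# Hypothetical payment of 0.10 (in any currency unit) attached to that response.
shares = allocate_credit(scores, payment=0.10)
```

Proportional splitting is only one possible rule. The harder open problem is the one raised above: whether a model can tell us the scores in the first place.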

Concluding thoughts

This is just one example of what I want to consider when developing AI technology and tools. I believe that if we create human support technology that is grounded in, and evaluated by, its ability to contribute to human health and to support vibrant organizational cultures, new AI technology will enhance our capacity as humans to take solid actions and will expand our perspectives.

By leveraging established foundational knowledge to further enable scientific anticipation, we can carry out the scientific process efficiently. I believe that this moment in time carries an important opportunity to ask the right questions about how we support and reward each other in science and engineering spaces and beyond. These questions carry with them values that aren’t just pillars of global, technology-informed governance, but values that are morally sacrosanct.

[1] Stay tuned for my future article, where I dive deeper into my interpretation of why these Imperial prison complex episodes from the Disney+ show “Andor” flashed through my mind.

[2] https://www.theguardian.com/technology/2025/jan/09/elon-musk-data-ai-training-artificial-intelligence