Saturday, February 03, 2024

I Am He

I have not checked off a box that affirms I am not a robot. I have not picked out which photos contain buses. I have not read and retyped a sequence of squiggly letters. And yet, because of two obscure statistical measures, you can rest assured that this is me. That's because according to an analysis of this writing, and let me quote here so I get it right, "There is a 0% probability this text was entirely written by AI. This text is most likely to be written by a human." And that be me.

One of the great challenges of our times is and will be determining what is real and what is not. It used to be fairly easy, as the underlying systems to generate text, pictures and speech were not that sophisticated, and fakes were obvious. It was like justice and innocence: we presumed it was human generated until proven otherwise. But with the advent of generative artificial intelligence and the popularization of and easy access to tools like ChatGPT and Bard, the balance has shifted. While you assume that what you are seeing is real, you now generally look at everything with at least a healthy dose of suspicion.

The question becomes how to determine what is created by machines versus by people. Researchers are working on ways of validating the end product, enabling viewers to verify that an actual person was the creator, and that what they are presenting is real as opposed to manufactured. For sure that will become more of an issue in the future as the systems get more powerful and the fakes get better. But at least for now, there are a number of "tells" that give away the answer if you are willing to look. 

With pictures, there are several. If you look closely, you will likely see little artifacts: a misshapen ear, an odd arrangement of hair, a strange reflection. That's because the systems synthesize each image from patterns learned across countless unrelated pictures, and the result is not always so seamless. Backgrounds can be blurry, so blurry that when you look closely you see they really are not composed of anything other than shapes and colors. And because real life is filled with imperfections, anything that looks too smooth is likely to be fake, or just a Kardashian.

With writing there are similar tests. The two markers that pop up used to be obscure statistical measurements, but have been repurposed to tell a real Hemingway from a fake Ernest. The first, called perplexity, is "a measure of uncertainty in the value of a sample from a discrete probability distribution." In plain speak, that means how hard it is to guess the next word in a sentence. A low score suggests the text is more likely machine generated than written by a person. And then there's burstiness, which is defined as "the intermittent increases and decreases in activity or frequency of an event." In terms of writing, it means using or not using a word or term in "bursts." Humans generally do it, machines generally do not.
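For the curious, here is a rough sketch of how those two measures might be computed. It is not how GPTZero or any other detector actually works. The function names are made up for illustration, the perplexity function assumes you already have the probability some language model assigned to each word that actually appeared, and the burstiness function borrows one common formula (standard deviation of the gaps between a word's appearances, compared against the average gap), which is only one of several definitions in use.

```python
import math


def perplexity(token_probs):
    """Perplexity of a text, given the probability a model gave each actual word.

    Hypothetical helper: token_probs is a list of values in (0, 1].
    Lower results mean the words were easier to guess.
    """
    if not token_probs:
        raise ValueError("need at least one probability")
    # Average negative log-probability, then exponentiate.
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


def burstiness(positions):
    """Burstiness of one word, given the positions where it appears.

    Assumed formula: B = (sigma - mu) / (sigma + mu), where mu and sigma are
    the mean and standard deviation of the gaps between appearances.
    B near -1 means perfectly regular spacing; higher values mean the word
    shows up in clumps.
    """
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three positions")
    mu = sum(gaps) / len(gaps)
    sigma = math.sqrt(sum((g - mu) ** 2 for g in gaps) / len(gaps))
    return (sigma - mu) / (sigma + mu)


if __name__ == "__main__":
    # Every word guessed with 25% confidence: as uncertain as a four-way coin.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
    # A word used like clockwork versus one used in a clump and then dropped.
    print(burstiness([0, 10, 20, 30]))
    print(burstiness([0, 1, 2, 50]))
```

In this sketch, evenly spaced usage scores at the bottom of the range, while a word that appears three times in a row and then vanishes for a while scores higher, which is the "bursts" idea in miniature.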

For both pictures and words, detection programs exist to help screen for the aforementioned anomalies. Drop an image into AIorNot.com, and it will render a judgment as to real or fake. Run your text through GPTZero, and it will analyze it and give you a thumbs up or down as to machine or human. As an experiment, I ran several recent columns through the program, and got the reported results. It's worth noting that while a number of the metrics it calculated for my stuff were very middle of the road, my burstiness measure, on a scale of 0 to 100, was between 250 and 400. Ain't no machine gonna wield words like this human, for better or worse.

And so I come to you as flesh and blood, and I have the data to back me up. That said, there is a caveat. Remember the second part of that initial analysis? "This text is most likely to be written by a human." A little hedging going on there. After all, as has been noted elsewhere, the odds of being murdered by a chicken are extremely low, but never zero. So note that qualifier "most likely." However, you can rest assured that, like the Beatles sang, I am he. You'll just have to trust me on this one.

-END-

Marc Wollin of Bedford swears he writes every word herein himself. His column appears weekly via email and online at http://www.glancingaskance.blogspot.com/, as well as via Facebook, LinkedIn and Twitter.

