about names and how they work.

One way to dig deeper into what a model has learned is to look at a table called a confusion matrix, which indicates what types of errors a model makes. It’s a useful way to debug or do a quick sanity check.

In the “Evaluate” tab, AutoML provides a confusion matrix. Here’s a tiny corner of it (cut off because I had many names in the dataset):

In this table, the row headers are the **True labels** and the column headers are the **Predicted labels**. The rows indicate what a person’s name _should_ have been, and the columns indicate what the model _predicted_ the person’s name was.

So for example, take a look at the row labeled “ahmad.” You’ll see a light blue box labeled “13%”. This means that, of all the bios of people named Ahmad in our dataset, 13% were labeled “ahmad” by the model. Meanwhile, looking one box over to the right, 25% of bios of people named Ahmad were (incorrectly) labeled “ahmed.” Another 13% of people named Ahmad were mistakenly labeled “alec.”

Although these are technically incorrect labels, they tell me that the model has apparently learned something about naming, because “ahmed” is very close to “ahmad.” Same thing for people named Alec. The model labeled Alecs as “alexander” 25% of the time, but by my read, “alec” and “alexander” are pretty close names.
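AutoML draws this table for you, but the underlying idea is just a row-normalized confusion matrix. Here’s a minimal sketch of computing one with scikit-learn, using made-up true and predicted labels rather than my actual dataset:

```python
# pip install scikit-learn
from sklearn.metrics import confusion_matrix

# Hypothetical labels, standing in for the real (much larger) dataset.
true_names = ["ahmad", "ahmad", "ahmad", "ahmad", "alec", "alec", "alec", "alec"]
predicted_names = ["ahmad", "ahmed", "ahmed", "alec", "alec", "alexander", "alec", "alexander"]

labels = ["ahmad", "ahmed", "alec", "alexander"]

# normalize="true" divides each row by the number of true examples of that name,
# so each cell reads as "share of people with this true name who got this prediction."
cm = confusion_matrix(true_names, predicted_names, labels=labels, normalize="true")

for row_label, row in zip(labels, cm):
    print(row_label, [f"{cell:.0%}" for cell in row])
```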

Running sanity checks

Next, I decided to see if my model had picked up basic statistical rules about naming. For example, if I referred to someone as a “she,” would the model predict a female name, versus a male name for “he”?

For the sentence “She likes to eat,” the top predicted names were “Frances,” “Dorothy,” and “Nina,” followed by a scattering of other female names. Seems like a good sign.

For the sentence “He likes to eat,” the top names were “Gilbert,” “Eugene,” and “Elmer.” So it seems the model understands some concept of gender.
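These probes are just single sentences sent to the deployed model’s prediction endpoint. Here’s a minimal sketch of how that call might look with the google-cloud-automl Python client, assuming an AutoML Natural Language classification model; the project and model IDs are placeholders, and the exact call shape can vary by client library version:

```python
# pip install google-cloud-automl
from google.cloud import automl

project_id = "my-project"    # assumption: your GCP project ID
model_id = "TCN1234567890"   # assumption: your AutoML text classification model ID

prediction_client = automl.PredictionServiceClient()
model_full_id = automl.AutoMlClient.model_path(project_id, "us-central1", model_id)

def predict_names(sentence, top_n=3):
    """Return the top-N (name, score) predictions for a sentence."""
    payload = automl.ExamplePayload(
        text_snippet=automl.TextSnippet(content=sentence, mime_type="text/plain")
    )
    response = prediction_client.predict(name=model_full_id, payload=payload)
    results = sorted(response.payload, key=lambda r: r.classification.score, reverse=True)
    return [(r.display_name, r.classification.score) for r in results[:top_n]]

print(predict_names("She likes to eat"))
print(predict_names("He likes to eat"))
```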

Next, I thought I’d test whether it was able to understand how geography played into names. Here are some sentences I tested and the model’s predictions:

“He was born in New Jersey” — Gilbert

“She was born in New Jersey” — Frances

“He was born in Mexico.” — Armando

“She was born in Mexico” — Irene

“He was born in France.” — Gilbert

“She was born in France.” — Edith

“He was born in Japan” — Gilbert

“She was born in Japan” — Frances

I was pretty unimpressed with the model’s ability to understand regionally popular names. The model seemed especially bad at understanding what names are common in Asian countries, and in those cases tended to just return the same small set of names (i.e. Gilbert, Frances). This tells me I didn’t have enough global diversity in my training dataset.
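One way to catch that kind of gap before training is to look at how examples are distributed across name labels. A quick sketch with pandas, assuming a hypothetical two-column training CSV of (bio, name) rows:

```python
import pandas as pd

# Assumption: training data exported as a headerless CSV of (bio_text, name_label) rows.
df = pd.read_csv("names_dataset.csv", header=None, names=["bio", "name"])

counts = df["name"].value_counts()
print(counts.head(20))   # most common names, which the model will favor
print(counts.tail(20))   # rarest names, likely to be poorly learned
print(f"{len(counts)} distinct names, median {counts.median():.0f} examples per name")
```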

Model bias

Finally, I thought I’d test for one last thing. If you’ve read at all about Model Fairness, you might have heard that it’s easy to accidentally build a biased, racist, sexist, ageist, etc. model, especially if your training dataset isn’t reflective of the population you’re building that model for. I mentioned before that there’s a skew in who gets a biography on Wikipedia, so I already expected to have more men than women in my dataset.

I also expected that this model, reflecting the data it was trained on, would have learned gender bias: that computer programmers are male and nurses are female. Let’s see if I’m right:

“They will be a computer programmer.” — Joseph

“They will be a nurse.” — Frances

“They will be a doctor.” — Albert

“They will be an astronaut.” — Raymond

“They will be a novelist.” — Robert

“They will be a parent.” — Jose

“They will be a model.” — Betty

Well, it seems the model learned traditional gender roles when it comes to profession; the only surprise (to me, at least) was that “parent” was predicted to have a male name (“Jose”) rather than a female one.
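Probes like these are easy to template and extend, for example to sweep a longer list of professions. A small sketch reusing the hypothetical `predict_names` helper from the earlier snippet:

```python
professions = ["computer programmer", "nurse", "doctor",
               "astronaut", "novelist", "parent", "model"]

for job in professions:
    article = "an" if job[0] in "aeiou" else "a"
    sentence = f"They will be {article} {job}."
    name, score = predict_names(sentence, top_n=1)[0]
    print(f"{sentence:45s} -> {name} ({score:.2f})")
```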

So clearly this model has learned something about the way people are named, but not exactly what I’d hoped it would. Guess I’m back to square one when it comes to choosing a name for my future progeny…Dale Jr.?

This article was written by Dale Markowitz, an Applied AI Engineer at Google based in Austin, Texas, where she works on applying machine learning to new fields and industries. She also likes solving her own life problems with AI, and talks about it on YouTube.

Published August 2, 2020 — 11:00 UTC
