"The Need for Inclusion in AI and Machine Learning" -- provided by Antiy Technology Group Co., Ltd. (Harbin)

2018-08-08



The Need for Inclusion in AI and Machine Learning

https://www.informationweek.com/big-data/ai-machine-learning/the-need-for-inclusion-in-ai-and-machine-learning/a/d-id/1330464?

11/21/2017
04:00 PM

Steven Aldrich and Bärí A. Williams

If companies are forward-thinking in their application of predictive analytics, AI, and machine learning, they can make these technologies inclusive, available, and relevant to all people.

Imagine if something not designed with you or anyone like you in mind was the driving force of how regular interactions permeate your life. Imagine it controls what products are marketed to you, how you can use certain consumer products (or not), influences your interactions with law enforcement, and even determines your health care diagnoses and medical decisions.

There are problems brewing at the core of artificial intelligence and machine learning (ML). AI algorithms are essentially opinions embedded in code. AI can create, formalize, or exacerbate biases by not including diverse perspectives during ideation, testing, and implementation. Ageism, ableism, sexism, and racism are being built into some services that are the foundation of many “intelligent” systems that shape how we are categorized, advertised to, and serviced, or disregarded.

ML output is only as good as the repetitive data, pictures, and word association inputs. If those inputs are chosen by a small, homogenous group of engineers or product managers, with no refinement or review outside of that group, the output can create biased results with little ability to check the underlying logic. This implies that the selection of training data needs to give a complex (instead of a singular) view of history, which will inform the future. If the sample size for these technologies is small and uniform, algorithms will adopt and reinforce biases gleaned from incomplete data trends, pictures, and words.
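One practical consequence of a small, uniform sample is that aggregate metrics can look healthy while an underrepresented segment is badly served. A minimal sketch in plain Python (the numbers and segment names are hypothetical, purely for illustration) of why model quality should be checked per segment rather than only overall:

```python
# Hypothetical evaluation records for a trained model: (segment, was the
# prediction correct?). Group B is only 5% of the evaluation data.
results = (
    [("group_a", True)] * 90 + [("group_a", False)] * 5
    + [("group_b", True)] * 2 + [("group_b", False)] * 3
)

# Aggregate accuracy across all records.
overall = sum(ok for _, ok in results) / len(results)

# Accuracy broken out by segment.
by_segment = {}
for seg, ok in results:
    by_segment.setdefault(seg, []).append(ok)
per_segment = {seg: sum(oks) / len(oks) for seg, oks in by_segment.items()}

print(f"overall: {overall:.0%}")             # 92% -- looks fine in aggregate
for seg in sorted(per_segment):
    print(f"{seg}: {per_segment[seg]:.0%}")  # group_a 95%, group_b only 40%
```

The headline number hides the failure: because the underrepresented group contributes so few records, a model can fail most of that group while still reporting strong overall accuracy.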

We can see recent examples of this when homogeneous teams, testing, and data create problems. It appears in simple photo recognition systems, such as Google’s mishap in summer 2015, when its Photos app tagged two pictures of Black people as gorillas.

There are also examples of AI and ML being developed on top of data-driven insights that, if not inclusive, could have damaging, and possibly life-threatening, results. Earlier this year, a University of Adelaide (Australia) study of 48 participants, all of whom were at least 60 years old, used artificial intelligence to analyze photos of the participants’ organs and predicted who would die within five years with 69 percent accuracy. While the ability to predict with such accuracy is phenomenal, there is a lesson to be learned from the small group of participants.

One can see how this information could be used for ML purposes to inform healthcare decisions and provide counsel and care. But key questions linger. What were the ethnicities of the participants? This is vital information, as certain ethnicities have a higher incidence of certain diseases and ailments: black women, for example, have a higher incidence of fibroids, and Latinx populations are more likely to have diabetes and liver disease. If the data sets that inform the technology do not include this kind of information, it could lead to disparate outcomes in health care treatment, including misdiagnoses, no diagnoses, or poor treatment plans.

Another sector where AI and ML can inform is public safety and police enforcement. Predictive policing applications, like Hunchlab, use historical crime data, moon phases, location, census data, and even professional sports team schedules to predict when and where crime will occur. The problem is that the foundational element, historical crime data, can be based upon policing practices that disproportionately target, for example, young black and brown men, primarily in low-income areas. These practices have been confirmed by the Justice Department findings in Ferguson, Missouri, and recent task force findings in San Francisco. If historical data shows most crime and arrests are in poor minority areas, predictive technology will reinforce that data, thus perpetuating a cycle of over-policing in poor minority areas.
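The feedback loop described above can be made concrete with a toy simulation (the numbers and area names are hypothetical, not any real policing model): recorded arrests depend on where patrols already are, so retraining on recorded data reproduces the initial allocation even when the true underlying rates are identical.

```python
# Toy feedback loop: patrols start skewed toward area_a, but the true
# crime rate is identical in both areas.
patrol_share = {"area_a": 0.8, "area_b": 0.2}
true_rate = {"area_a": 0.5, "area_b": 0.5}

for step in range(5):
    # Arrests are only recorded where officers are present, so recorded
    # counts reflect patrol presence, not the true rate.
    recorded = {area: patrol_share[area] * true_rate[area]
                for area in patrol_share}
    total = sum(recorded.values())
    # "Retrain" by allocating the next round of patrols by recorded share.
    patrol_share = {area: count / total for area, count in recorded.items()}

print(patrol_share)  # the initial 80/20 skew persists round after round
```

Even with equal true crime rates, the model never discovers that fact, because its only evidence comes from where it already sent officers.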

Applications imbued with intelligence can also determine how, when, and why you spend your money. Advertising agencies are testing IBM Watson’s AI capabilities to target offers for their clients’ products. For example, a campaign personalizes recommendations by understanding a user’s personality, tone, and emotion conveyed in a conversation, all of which sounds great if the recommendations don’t blur into profiling. A hypothetical outcome to avoid: the application decides, based on your vocal patterns, that you’re more South Central Los Angeles than Beverly Hills and provides only information on historically black colleges and universities.

Let’s avoid this future and instead create an inclusive one. Here are practical steps to help.

Tap into a diverse team. If you don’t have representation from the different types of customers you serve (or want to serve) on the core team, can you find it on the extended team? Bringing in a diverse set of employees from other functions gives additional perspective. If you still can’t find a representative sample inside your company, how about bringing customers into the development cycle? You can get their feedback during user story creation or have them sit in on acceptance demos.

GoDaddy learned that lesson in building a recent product, a do-it-yourself website builder called GoCentral. It includes thousands of images that are queried to select the relevant few to build a first version of a small business’ website. When we launched the service in the US, almost all the images showed Caucasians when they featured people. Our customer care team and an early customer immediately pointed out that this did not work for websites selling to minorities. In addition, we had global aspirations for the product, so showing solely Caucasian faces, where other skin types are in the majority, wasn’t representative. The team threw out those images and started again, pulling in a much more diverse set of images.

Find a diverse training data set. If your objectives are to serve a broad audience, it is critical to expand your sources to get enough data for each segment. As voice-command user interfaces are growing in usage, there are many people who are frustrated because they are not understood. Why? Because they have an accent. Much of the early training data for voice services came from researchers who had collected data using neutral accents (like those of broadcasters) and college students (a homogenous group). Google’s voice recognition services are expanding the training set data, setting an example of how to tackle problems of narrow data sources leading to biased and therefore unusable services.

We’ve also found at GoDaddy that we need to explicitly obtain data that covers customers with enough depth to get good results for each group. An example is that our domain search modeling team initially applied our US model in deciding what domain names to show in other countries and saw it perform worse than having no smarts at all. Why? The international customers behaved quite differently when exposed to the same patterns. Only once we broke out specific countries into separate models did we see significant improvement. 
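A minimal sketch of that fix, with made-up country codes and stand-in scoring functions rather than GoDaddy's actual models: route each request to a country-specific model when one exists, and fall back to the global model otherwise.

```python
# Stand-in "models": each scores a candidate domain name for its market.
us_model = lambda name: 1.0 if name.endswith(".com") else 0.0
uk_model = lambda name: 1.0 if name.endswith(".co.uk") else 0.0

country_models = {"US": us_model, "GB": uk_model}

def rank_domains(candidates, country, models, global_model):
    # Prefer the per-country model; fall back to the global one.
    model = models.get(country, global_model)
    return sorted(candidates, key=model, reverse=True)

print(rank_domains(["shop.com", "shop.co.uk"], "GB", country_models, us_model))
# ['shop.co.uk', 'shop.com']
print(rank_domains(["shop.com", "shop.co.uk"], "FR", country_models, us_model))
# ['shop.com', 'shop.co.uk']
```

The design choice mirrors the lesson in the paragraph above: one global model imposes one market's behavior on everyone, while per-segment models let each country's customers be ranked by patterns learned from customers like them.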

The pace of law lags that of technology, as the latter drives innovation and the former waits to see the results before passing legislation or creating policies. If companies are forward-thinking in their application of predictive analytics, AI, and machine learning, they can make these technologies inclusive without the need for new laws or regulations. There are clear steps available to all companies, regardless of size, to bring this technology to all people. It starts with someone on the team asking the question, “Are we building a vision of the future that includes everyone?”

Steven Aldrich is chief product officer at GoDaddy. It’s Steven’s job to deliver innovative, integrated, and differentiated products. Steven joined GoDaddy in 2012, initially leading GoDaddy’s Productivity business. He spent over a decade at Intuit, where he helped grow the consumer and small business divisions by moving them from the desktop to the web. He also co-founded an online service that simplified shopping for insurance and has been the CEO of two other venture-funded start-ups. Steven earned an M.B.A. from Stanford and a B.A. in physics from the University of North Carolina. Steven serves as a member of the Board of Directors of Blucora.

Bärí A. Williams, Esq. is Associate General Counsel at Marqeta. Previously she was Head of Business Operations Management, North America at StubHub, where she was responsible for business planning and operations to manage and oversee technical metrics, product innovation, and partnerships and drive P&L results across the company. Prior to StubHub, Bärí was a commercial attorney at Facebook supporting internet.org connectivity efforts, building aircraft, satellites, and lasers, along with purchasing and procurement.
