As an employer, Amazon is in high demand, and the company receives a flood of applications. Little wonder, therefore, that it is seeking ways to automate the pre-selection process, which is why the company developed an algorithm to filter out the most promising applications.
This AI algorithm was trained on employee data sets so that it could learn who would be a good fit for the company. However, the algorithm systematically disadvantaged women: because more men had been recruited in the past, far more of the training data sets related to men than to women, with the result that the algorithm treated gender as a knockout criterion. Amazon finally abandoned the system when it became clear that this bias could not be reliably ruled out despite adjustments to the algorithm.
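To see how this can happen, consider a minimal synthetic sketch in Python. It is not Amazon's system; all names, numbers and encodings are invented purely for illustration. A classifier is trained on historical hiring decisions that favoured men, even though skill is distributed identically across genders:

```python
# Minimal synthetic sketch (not Amazon's system): a model trained on
# historically biased hiring decisions learns gender as a predictor.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
gender = rng.integers(0, 2, n)   # 0 = female, 1 = male (invented encoding)
skill = rng.normal(0.0, 1.0, n)  # identically distributed across genders

# Historical labels: past recruiters favoured men regardless of skill.
hired = (skill + 1.5 * gender + rng.normal(0.0, 1.0, n)) > 1.0

model = LogisticRegression().fit(np.column_stack([gender, skill]), hired)
print("learned weights (gender, skill):", model.coef_[0])
# The gender weight comes out large and positive: the historical bias is
# reproduced by the model as if it were a legitimate hiring criterion.
```

Simply dropping the gender column does not necessarily fix this, since other features can act as proxies for gender, which is consistent with Amazon's finding that the bias could not be reliably ruled out despite adjustments.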
This example shows how quickly someone could be placed at a disadvantage in a world of algorithms, without ever knowing why, and often without even knowing it. “Should this happen with automated music recommendations or machine translation, it may not be critical,” says Marco Huber, “yet it is a completely different matter when it comes to legally and medically relevant issues or in safety-critical industrial applications.”
AI algorithms are increasingly taking decisions that have a direct impact on humans.
IDENTIFYING RISKS AND SIDE EFFECTS
In Sarah Oppold’s opinion, this is an example of an inadequately implemented algorithm. “The input data was unsuitable and the problem to be solved was poorly formulated,” says the computer scientist, who is currently completing her doctoral studies at the University of Stuttgart’s Institute of Parallel and Distributed Systems (IPVS), where she is researching how best to design AI algorithms in a transparent manner. “Whilst many research groups are primarily focusing on the model underlying the algorithm,” Oppold explains, “we are attempting to cover the entire chain, from the collection and pre-processing of the data through the development and parameterization of the AI method to the visualization of the results.” The objective is thus not to produce a white box for individual AI applications, but rather to represent the entire life cycle of the algorithm in a transparent and traceable manner.
The result is a kind of regulatory framework. In the same way that a digital image contains metadata, such as exposure time, camera type and location, the framework would attach explanatory notes to an algorithm – for example, that the training data refers to Germany and that the results, therefore, are not transferable to other countries. “You could think of it like a drug,” says Oppold: “It has a specific medical application and a specific dosage, but there are also associated risks and side effects. Based on that information, the health care provider will decide which patients the drug is appropriate for.”
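As an illustration of what such a “package insert” for an algorithm might look like, here is a minimal, hypothetical sketch in Python. The field names and example values are invented for this article and do not represent the IPVS framework's actual schema:

```python
# Hypothetical sketch of machine-readable metadata attached to a model,
# recording where its training data came from and where its results
# do (and do not) transfer. Field names are invented, not the IPVS schema.
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    name: str
    task: str                  # intended application (the "dosage")
    training_data_region: str  # e.g. data collected in Germany only
    valid_scope: str           # where the results may be applied
    known_risks: list[str] = field(default_factory=list)  # "side effects"

card = ModelCard(
    name="credit-scoring-v2",
    task="Estimate default risk from tabular applicant data",
    training_data_region="Germany",
    valid_scope="German applicants only; not transferable to other countries",
    known_risks=["Applicants under 25 are underrepresented in training data"],
)
print(card)
```

Just as a package insert lets a physician decide whether a drug suits a particular patient, such a record would let a user judge whether an algorithm's results apply to their context.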
The framework has not yet been developed to the point where it can perform comparable tasks for an algorithm. “It currently only takes tabular data into account,” Oppold explains: “We now want to expand it to take in imaging and streaming data.” A practical framework would also need to incorporate interdisciplinary expertise, for example from AI developers, social scientists and legal experts. “As soon as the framework reaches a certain level of maturity,” the computer scientist explains, “it would make sense to collaborate with the industrial sector to develop it further and make the algorithms used in industry more transparent.”