
Fresh incentives for reporting negative results


Sarahanne Field, editor-in-chief of the Journal of Trial & Error, wants to publish the messy, null and negative results sitting in researchers’ file drawers. Credit: Sander Martens

Editor-in-chief Sarahanne Field describes herself and her team at the Journal of Trial & Error as wanting to highlight the “ugly side of science — the parts of the process that have gone wrong”.

She clarifies that the editorial board of the journal, which launched in 2020, isn’t interested in papers in which “you did a shitty study and you found nothing. We’re interested in stuff that was done methodologically soundly, but still yielded a result that was unexpected.” These types of result — which do not prove a hypothesis or could yield unexplained outcomes — often simply go unpublished, explains Field, who is also an open-science researcher at the University of Groningen in the Netherlands. Along with Stefan Gaillard, one of the journal’s founders, she hopes to change that.

Calls for researchers to publish failed studies are not new. The ‘file-drawer problem’ — the stacks of unpublished, negative results that most researchers accumulate — was first described in 1979 by psychologist Robert Rosenthal. He argued that this leads to publication bias in the scientific record: the gap of missing unsuccessful results leads to overemphasis on the positive results that do get published.

Over the past 30 years, the proportion of negative results being published has decreased further. A 2012 study showed that, from 1990 to 2007, there was a 22% increase in positive conclusions in papers; by 2007, 85% of papers published had positive results[1]. “People fail to report [negative] results, because they know they won’t get published — and when people do attempt to publish them, they get rejected,” says Field. A 2022 survey of researchers in chemistry, physics, engineering and environmental sciences in France showed that, although 81% had produced relevant negative results and 75% were willing to publish them, only 12.5% had the opportunity to do so[2].

One factor that is leading some researchers to revisit the problem is the growing use of predictive modelling using machine-learning tools in many fields. These tools are trained on large data sets that are often derived from published work, and scientists have found that the absence of negative data in the literature is hampering the process. Without a concerted effort to publish more negative results that artificial intelligence (AI) can be trained on, the promise of the technology could be stifled.

“Machine learning is changing how we think about data,” says chemist Keisuke Takahashi at Hokkaido University in Japan, who has brought the issue to the attention of the catalysis-research community. Scientists in the field have typically relied on a mixture of trial and error and serendipity in their experiments, but there is hope that AI could provide a new route for catalyst discovery. Takahashi and his colleagues mined data from 1,866 previous studies and patents to train a machine-learning model to predict the best catalyst for the reaction between methane and oxygen to form ethane and ethylene, both of which are important chemicals used in industry[3]. But, he says, “over the years, people have only collected the good data — if they fail, they don’t report it”. This led to a skewed model that, in some cases, enhanced the predicted performance of a material, rather than realistically assessing its properties.

Synthetic organic chemist Felix Strieth-Kalthoff found that published data were too heavily biased toward positive results to effectively train an AI model to optimize chemical-reaction yields. Credit: Cindy Huang

Alongside the flawed training of AI models, the huge gap of negative results in the scientific record continues to be a problem across all disciplines. In areas such as psychology and medicine, publication bias is one factor exacerbating the ongoing reproducibility crisis — in which many published studies are impossible to replicate. Without sharing negative studies and data, researchers could be doomed to repeat work that led nowhere. Many scientists are calling for changes in academic culture and practice — be it the creation of repositories that include positive and negative data, new publication formats or conferences aimed at discussing failure. The solutions are varied, but the message is the same: “To convey an accurate picture of the scientific process, then at least one of the components should be communicating all the results, [including] some negative results,” says Gaillard, “and even where you don’t end up with results, where it just goes wrong.”

Science’s messy side

Synthetic organic chemist Felix Strieth-Kalthoff, who is now setting up his own laboratory at the University of Wuppertal, Germany, has encountered positive-result bias when using data-driven approaches to optimize the yields of certain medicinal-chemistry reactions. His PhD work with chemist Frank Glorius at the University of Münster, Germany, involved creating models that could predict which reactants and conditions would maximize yields. Initially, he relied on data sets that he had generated from high-throughput experiments in the lab, which included results from both high- and low-yield reactions, to train his AI model. “Our next logical step was to do that based on the literature,” says Strieth-Kalthoff. This would allow him to curate a much larger data set to be used for training.

But when he incorporated real data from the reactions database Reaxys into the training process, he says, “[it] turned out they don’t really work at all”. Strieth-Kalthoff concluded that the errors were due to the lack of low-yield reactions[4]: “All of the data that we see in the literature have average yields of 60–80%.” Without learning from the messy ‘failed’ experiments with low yields that were present in the initial real-life data, the AI could not model realistic reaction outcomes.
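
To see why a literature-only training set misleads a model, consider a minimal sketch (invented descriptors and yields, not the actual Reaxys workflow): the same regressor is fitted once on a full yield distribution and once on only the ‘publishable’ high-yield slice, then both are asked about conditions that should fail.

```python
# A toy illustration of literature bias, using made-up reaction data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 3))                    # hypothetical reaction descriptors
y = np.clip(100 * X[:, 0] - 30 * X[:, 1]
            + rng.normal(0, 5, 500), 0, 100)            # "true" yield, 0-100%

published = y >= 60                                     # only high yields reach the literature

full_model = RandomForestRegressor(random_state=0).fit(X, y)
lit_model = RandomForestRegressor(random_state=0).fit(X[published], y[published])

x_bad = np.array([[0.1, 0.9, 0.5]])                     # conditions that should give a poor yield
print(full_model.predict(x_bad))                        # low, as expected
print(lit_model.predict(x_bad))                         # inflated: this model never saw failure
```

The literature-trained model cannot predict below the yields it has seen, which is exactly the inflation Takahashi and Strieth-Kalthoff describe.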

Although AI has the potential to spot relationships in complex data that a researcher might not see, encountering negative results can give experimentalists a gut feeling, says molecular modeller Berend Smit at the Swiss Federal Institute of Technology Lausanne. The usual failures that every chemist experiences at the bench give them a ‘chemical intuition’ that AI models trained only on successful data lack.

Smit and his team attempted to embed something similar to this human intuition into a model tasked with designing a metal-organic framework (MOF) with the largest known surface area for this type of material. A large surface area allows these porous materials to be used as reaction supports or molecular storage reservoirs. “If the binding [between components] is too strong, it becomes amorphous; if the binding is too weak, it becomes unstable, so you need to find the sweet spot,” Smit says. He showed that training the machine-learning model on both successful and unsuccessful reaction conditions created better predictions and ultimately led to a model that successfully optimized the MOF[5]. “When we saw the results, we thought, ‘Wow, this is the chemical intuition we’re talking about!’” he says.

According to Strieth-Kalthoff, AI models are currently limited because “the data that are out there just do not reflect all of our knowledge”. Some researchers have sought statistical solutions to fill the negative-data gap. Techniques include oversampling, which means supplementing data with several copies of existing negative data or creating artificial data points, for example by including reactions with a yield of zero. But, he says, these types of approach can introduce their own biases.
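
The two padding strategies he refers to might look like this in practice — a minimal sketch with invented arrays, not his pipeline:

```python
# Naive versions of the two approaches: duplicating scarce negatives and
# appending synthetic zero-yield reactions. All data here are invented.
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(200, 3))                 # hypothetical descriptors
y = rng.uniform(50, 95, size=200)                    # literature-like: mostly high yields

# Oversampling: repeat the few low-yield examples several times.
low = y < 60
X_over = np.vstack([X, np.repeat(X[low], 5, axis=0)])
y_over = np.concatenate([y, np.repeat(y[low], 5)])

# Synthetic negatives: add invented conditions with the yield pinned to zero.
X_fake = rng.uniform(0, 1, size=(50, 3))
X_aug = np.vstack([X, X_fake])
y_aug = np.concatenate([y, np.zeros(50)])
```

Both augmented sets no longer reflect the true distribution of outcomes, which is the new bias that these approaches can introduce.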

Computer scientist Ella Peltonen helped to organize the first International Workshop on Negative Results in Pervasive Computing in 2022 to give researchers an opportunity to discuss failed experiments. Credit: University of Oulu

Capturing more negative data is now a priority for Takahashi. “We definitely need some sort of infrastructure to share the data freely.” His group has created a website for sharing large amounts of experimental data for catalysis reactions. Other organizations are trying to collect and publish negative data — but Takahashi says that, so far, they lack coordination, so data formats aren’t standardized. In his field, Strieth-Kalthoff says, there are initiatives such as the Open Reaction Database, launched in 2021 to share organic-reaction data and enable training of machine-learning applications. But, he says, “right now, nobody’s using it, [because] there’s no incentive”.

Smit has argued for a modular open-science platform that would directly link to electronic lab notebooks to help to make different data types extractable and reusable. Through this process, publication of negative data in peer-reviewed journals could be skipped, but the information would still be available for researchers to use in AI training. Strieth-Kalthoff agrees with this strategy in theory, but thinks it’s a long way off in practice, because it would require analytical instruments to be coupled to a third-party source to automatically collect data — which instrument manufacturers might not agree to, he says.

Publishing the non-positive

In other disciplines, the emphasis is still on peer-reviewed journals that will publish negative results. Gaillard, a science-studies PhD student at Radboud University in Nijmegen, the Netherlands, co-founded the Journal of Trial & Error after attending talks on how science can be made more open. Gaillard says that, although everyone whom they approached liked the idea of the journal, nobody wanted to submit articles at first. He and the founding editorial team embarked on a campaign involving cold calls and publicity at open-science conferences. “Slowly, we started getting our first submissions, and now we just get people sending things in [unsolicited],” he says. Most years the journal publishes one issue of about 8–14 articles, and it is starting to publish more special issues. It focuses mainly on the life sciences and data-based social sciences.

In 2008, David Alcantara, then a chemistry PhD student at the University of Seville in Spain who was frustrated by the lack of platforms for sharing negative results, set up The All Results journals, which were aimed at disseminating results regardless of the outcome. Of the four disciplines included at launch, only the biology journal is still being published. “Attracting submissions has always posed a challenge,” says Alcantara, now president at the consultancy and training organization the Society for the Improvement of Science in Seville.

But Alcantara thinks there has been a shift in attitudes: “More established journals [are] becoming increasingly open to considering negative results for publication.” Gaillard agrees: “I’ve seen more and more journals, like PLoS ONE, for example, that explicitly mentioned that they also publish negative results.” (Nature welcomes submissions of replication studies and those that include null results, as described in this 2020 editorial.)

Journals might be changing their publication preferences, but there are still significant disincentives that stop researchers from publishing their file-drawer studies. “The current academic system often prioritizes high-impact publications and ground-breaking discoveries for career advancement, grants and tenure,” says Alcantara, noting that negative results are perceived as contributing little to nothing to these endeavours. Plus, there is still a stigma associated with any kind of failure. “People are afraid that this will look negative on their CV,” says Gaillard. Smit describes reporting failed experiments as a no-win situation: “It’s more work for [researchers], and they don’t get anything in return in the short term.” And, jokes Smit, what’s worse is that they could be providing data for an AI tool to take over their role.

Ultimately, most researchers conclude that publishing their failed studies and negative data is just not worth the time and effort — and there’s evidence that they judge others’ negative research more harshly than positive outcomes. In a study published in August, 500 researchers from top economics departments around the world were randomized to two groups and asked to judge a hypothetical research paper. Half of the participants were told that the study had a null conclusion, and the other half were told that the results were sizeable and statistically significant. The null results were perceived to be 25% less likely to be published, of lower quality and less important than were the statistically significant findings[6].

Some researchers have had positive experiences sharing their unsuccessful findings. For example, in 2021, psychologist Wendy Ross at London Metropolitan University published her negative results from testing a hypothesis about human problem-solving in the Journal of Trial & Error[7], and says the paper was “the best one I have published to date”. She adds, “Understanding the reasons for null results can really test and expand our theoretical understanding.”

Fields forging solutions

The field of psychology has introduced one innovation that could change publication biases — registered reports (RRs). These peer-reviewed reports, first published in 2014, came about largely as a response to psychology’s replication crisis, which began in around 2011. RRs set out the methodology of a study before the results are known, to try to prevent selective reporting of positive results. Daniël Lakens, who studies science-reward structures at Eindhoven University of Technology in the Netherlands, says there is evidence that RRs increase the proportion of negative results in the psychology literature.

In a 2021 study, Lakens analysed the proportion of published RRs whose results eventually support the primary hypothesis. In a random sample of hypothesis-testing studies from the standard psychology literature, 96% of the results were positive. In RRs, this fell to only 44%[8]. Lakens says the study shows “that if you offer this as an option, many more null results enter the scientific literature, and that is a desirable thing”. At least 300 journals, including Nature, are now accepting RRs, and the format is spreading to journals in biology, medicine and some social-science fields.

Yet another approach has emerged from the field of pervasive computing, the study of how computer systems are integrated into physical surroundings and everyday life. About four years ago, members of the community started discussing reproducibility, says computer scientist Ella Peltonen at the University of Oulu in Finland. Peltonen says that researchers realized that, to avoid the repetition of mistakes, there was a need to discuss the practical problems with studies and failed results that don’t get published. So in 2022, Peltonen and her colleagues held the first virtual International Workshop on Negative Results in Pervasive Computing (PerFail), in conjunction with the field’s annual conference, the International Conference on Pervasive Computing and Communications.

Peltonen explains that PerFail speakers first present their negative results and then have the same amount of time for discussion afterwards, during which participants tease out how failed studies can inform future work. “It also encourages the community to showcase that things require effort and trial and error, and there is value in that,” she adds. The workshop is now an annual event, and the organizers invite students to attend so they can see that failure is a part of research and that “you are not a bad researcher because you fail”, says Peltonen.

In the long run, Alcantara thinks a continued effort to persuade scientists to share all their results needs to be coupled with policies at funding agencies and journals that reward full transparency. “Criteria for grants, promotions and tenure should recognize the value of comprehensive research dissemination, including failures and negative outcomes,” he says. Lakens thinks funders could be key to boosting the RR format, as well. Funders, he adds, should say, “We want the research that we’re funding to appear in the scientific literature, regardless of the significance of the finding.”

There are some positive signs of change about sharing negative data: “Early-career researchers and the next generation of scientists are particularly receptive to the idea,” says Alcantara. Gaillard is also optimistic, given the increased interest in his journal, including submissions for an upcoming special issue on mistakes in the medical domain. “It is slow, of course, but science is a bit slow.”


Matter 1.3 Specification Adds Energy Reporting, Electric Vehicle Charging, Water Management Support and More


The Connectivity Standards Alliance (CSA) today announced the debut of a new Matter 1.3 specification that’s available for device makers and platforms. Matter is a smart home protocol that allows smart devices to work across multiple platforms, including HomeKit.

Matter 1.3 adds support for a range of new device types and features, including water management devices, electric vehicle chargers, kitchen and laundry appliances, and TVs.

For smart plugs and other devices, the update includes energy-management reporting, allowing users to see actual and estimated measurements of power, voltage, and current, both in real time and over time. EV charger manufacturers are able to include Matter-based features such as manually starting and stopping charging, adjusting the charging rate, and optimizing charging times.

Water management devices like leak and freeze detectors, rain sensors, and controllable water valves are supported in Matter 1.3, as are several new appliance types including microwave ovens, ovens, cooktops, extractor hoods, and dryers.

For TVs, Matter 1.3 improves casting initialization and search, plus it adds support for push messages and dialog for ambient experiences, expanded interactivity for TV apps, and better interaction with other home devices.

Scenes are supported with the new specification, allowing product makers and platforms to set, read, and activate scenes on devices. Scenes for Matter work like HomeKit scenes, letting users set a desired state for rooms and devices with one command. Matter controllers are also now able to batch multiple commands into a single message when communicating with Matter devices, reducing the delay between command executions.
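
As a rough illustration of why batching cuts latency, here is a sketch in Python; the message layout and every name in it are invented for illustration and are not the Matter SDK’s API.

```python
# Hypothetical sketch: several commands packed into one message instead of
# one network round trip per command. Not the real Matter data model.
from dataclasses import dataclass, asdict

@dataclass
class Command:
    endpoint: int          # which part of the device is addressed
    cluster: str           # feature group, e.g. "OnOff" or "LevelControl"
    name: str              # command within the cluster
    payload: dict

def transport_send(message: dict) -> None:
    print("one message on the wire:", message)       # stand-in for the radio layer

def send_batch(device: str, commands: list[Command]) -> None:
    # All commands travel together, so the device can execute them back to
    # back with far less delay than separate request/response exchanges.
    transport_send({"to": device, "invokeRequests": [asdict(c) for c in commands]})

send_batch("living-room-hub", [
    Command(1, "OnOff", "Off", {}),                            # lamp off
    Command(2, "LevelControl", "MoveToLevel", {"level": 30}),  # dim the light strip
])
```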

Matter 1.3 devices and improvements will likely be available on the market later this year, with more information available on the CSA website.


ChatGPT vs Excel data analysis and reporting compared


Over the last 18 months, artificial intelligence has exploded into our lives, transforming many of the applications and services we use every day. Now, with the OpenAI GPT Store and the ability to create your own custom GPTs easily, ChatGPT and other AI models are moving into areas that previously required extensive knowledge of specific applications and processes. One example is the ChatGPT Data Analyst custom GPT, available in the OpenAI GPT Store, which is designed to replace the need for knowledge of Excel functions such as pivot tables, making it even easier to analyze important data about your business, investments or personal life.

OpenAI’s ChatGPT large language model stands out for its ability to handle complex data sets with ease. It uses advanced algorithms to navigate through large volumes of data, identifying patterns and trends that might be difficult for humans to spot. Whether it’s understanding customer behavior or analyzing sales data, ChatGPT can do it all. Its analytical skills are so comprehensive that it’s like having a seasoned data analyst at your disposal.

For those looking to maximize the potential of ChatGPT, a premium subscription is available. This subscription gives users access to customized GPT models that are specifically tailored to their data analysis needs. It’s like having a personal data analyst who is ready to tackle any challenge you throw their way.

ChatGPT Data Analyst

Check out the video created by Kevin Stratvert to learn more about the differences between Excel and the ChatGPT Data Analyst custom GPT, which was built by the team at OpenAI and is available from the OpenAI GPT Store.


Once you have your subscription, you can start uploading data into ChatGPT. The process is straightforward: you provide the AI with Excel files, and it begins to process the information. It can handle a variety of data, from customer details to transaction histories, and turn these numbers into actionable insights.

ChatGPT’s skill with Excel is particularly noteworthy. It can navigate through spreadsheets with ease, identify key pieces of data, and answer both simple and complex queries. For example, if you need to know who your top customer was in the last quarter, ChatGPT can quickly find that information for you.

But ChatGPT’s talents don’t stop at data processing. It can also create visual representations of data, such as bar charts, which make it easier to understand complex information at a glance. If these visuals need to be adjusted, ChatGPT can modify them to better suit your needs.
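
By way of comparison, here is roughly what those two steps — the top-customer query and a quick bar chart — look like when done by hand in pandas; the file name and columns are hypothetical.

```python
# Manual equivalent of the queries described above, with invented columns.
import pandas as pd
import matplotlib.pyplot as plt

sales = pd.read_excel("sales.xlsx")                  # columns: customer, amount, date
sales["date"] = pd.to_datetime(sales["date"])

last_quarter = sales[sales["date"] >= "2024-01-01"]  # example cutoff for the quarter
totals = (last_quarter.groupby("customer")["amount"]
          .sum().sort_values(ascending=False))

print(totals.head(1))                                # the top customer and their total

totals.head(5).plot.bar(title="Top customers, last quarter")
plt.tight_layout()
plt.show()
```

The point of the custom GPT is that a question asked in plain English replaces all of the above.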

Another area where ChatGPT excels is in forecasting and trend analysis. It’s not just about understanding what has happened in the past; ChatGPT can also predict what might happen in the future. By recognizing patterns in your data, it can provide forecasts that help you prepare for what’s to come.
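
A bare-bones version of such a forecast, assuming a short series of monthly figures and a straight-line trend (ChatGPT’s actual method may differ):

```python
# Linear trend extrapolation over invented monthly revenue figures.
import numpy as np

revenue = np.array([12.1, 12.9, 13.4, 14.2, 14.8, 15.9])   # hypothetical data
months = np.arange(len(revenue))

slope, intercept = np.polyfit(months, revenue, 1)           # least-squares line
forecast = slope * len(revenue) + intercept
print(f"forecast for next month: {forecast:.1f}")
```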

To get the most out of ChatGPT as a data analyst, it’s crucial to ask the right questions. The key is not in mastering data analysis tools but in knowing how to formulate inquiries that prompt ChatGPT to dig deeper and uncover more profound insights. This means that even those with limited data expertise can use AI to gain a better understanding of their information.

ChatGPT vs Excel

1. Capabilities

ChatGPT Data Analyst Custom GPT:

  • AI-Powered Analysis: Utilizes advanced AI algorithms to understand and process data queries, making it capable of handling complex data analysis tasks without requiring the user to have in-depth technical knowledge.
  • Natural Language Processing (NLP): Users can ask questions in natural language, making it accessible to those without a background in data science or Excel.
  • Versatility: Capable of performing a wide range of tasks, from simple data lookups to complex analysis involving multiple data sources and types.
  • Dynamic Visualization: Can generate visual representations of data analysis results, such as bar charts, with the ability to customize aspects like color and format based on user requests.

Microsoft Excel:

  • Comprehensive Toolset: Offers a vast array of built-in functions, formulas, and tools for data analysis, including pivot tables, charts, and conditional formatting.
  • Granular Control: Users have fine-grained control over data manipulation and presentation, allowing for detailed customization of analyses and reports.
  • Data Management: Strong capabilities in organizing, filtering, and sorting large datasets, with features such as Power Query for data import and transformation.
  • Visualization: Excel supports a wide range of charts, graphs, and pivot charts, with extensive customization options.

2. Ease of Use

ChatGPT Data Analyst Custom GPT:

  • User-Friendly: The ability to interact through natural language reduces the learning curve, making it more accessible to users without technical expertise.
  • Fast Setup: Requires minimal setup, as users can simply type questions and get insights almost immediately without navigating complex interfaces.

Microsoft Excel:

  • Learning Curve: Excel has a steeper learning curve, especially for advanced features like pivot tables and complex formulas, requiring training or prior knowledge.
  • Flexibility and Precision: While it offers greater control, this also means that users must know exactly how to manipulate the tool to extract the desired insights.

3. Flexibility and Integration

ChatGPT Data Analyst Custom GPT:

  • Integration with Other Data Sources: Potentially limited by the platform’s current capabilities and access rights, requiring data to be uploaded or manually inputted.
  • Customization: While AI can generate custom analyses and visualizations, there may be limitations compared to the depth of customization available in Excel.

Microsoft Excel:

  • Integration Capabilities: Excel integrates well with other software and databases, allowing for direct data import and analysis.
  • Highly Customizable: Offers in-depth customization for data analysis and reporting, catering to a wide range of professional needs.

4. Potential Applications

ChatGPT Data Analyst Custom GPT:

  • Ideal for users needing quick insights without the necessity for deep dives into the data or for those lacking the technical skills to use Excel effectively.
  • Suitable for generating easy-to-understand explanations and visualizations of data trends and patterns.

Microsoft Excel:

  • Excel remains the tool of choice for detailed data analysis, financial modeling, and situations requiring precise control over data manipulation and presentation.
  • Preferred in professional settings where data analysis needs to be in-depth and highly customized.

The choice between ChatGPT vs Excel depends on the user’s needs, skill level, and the complexity of the data analysis required. ChatGPT Data Analyst excels in accessibility, ease of use, and quick insights, making it suitable for users who need straightforward answers or lack data analysis skills.

Microsoft Excel, on the other hand, offers unparalleled depth and control, catering to professionals who require detailed, customized data analysis and reporting. The future of data analysis may increasingly involve a blend of both AI-driven insights and traditional data manipulation tools, leveraging the strengths of each according to the task at hand.

ChatGPT is reshaping the landscape of data analysis. With a premium subscription, users can unlock tailored GPT models that can dissect data, respond to inquiries, generate visual aids, and anticipate trends. The power of data analysis now lies in your ability to ask ChatGPT the right questions, which opens the door to the insights you need. Embrace this new approach to data analysis with ChatGPT and turn the tide of data to your advantage.
