Continue reading for background on traditional R&D studies, objectives of using machine learning software in R&D tax credit services, and more key results from this case study.
An international tax consulting firm that provides R&D Tax Credit services to mid-market and large businesses across the globe sought to increase efficiency in its highly manual R&D study process using machine learning-driven R&D tax credit automation software.
In a traditional R&D study, accountants conduct time allocation interviews with R&D-involved employees from the client company (engineers, engineering managers, and the like) in an effort to determine how much of each employee's time was spent on Qualified Research Activities (QRAs) as defined by the IRS.
Following the interviews, the firm's tax professionals craft summaries of their findings to serve as substantive evidence of the client's eligibility for the tax credit. This documentation serves to protect the client's tax credit in the event of a future IRS audit.
Traditional interview-based R&D studies can cost tax professionals and clients hundreds of hours and cause significant operational disruption. They also carry audit risk stemming from estimates, recency bias, and insufficient evidence.
It's no secret that teams in many industries use cloud-based software platforms on a day-to-day basis throughout the year to manage their to-do lists, collaborate with teammates, and track overall project completion.
The use of such tools creates task- and project-related data that's well-suited for R&D tax credit analyses. But the volume of data makes it unrealistic (at least, economically) for a tax expert to manually comb through it in search of R&D tax credit insights and findings.
As such, tax experts have traditionally resorted to taking a practical but inefficient approach to determining the client's Qualified Research Expenses: they conduct interviews with client employees to learn about the R&D projects they worked on during the previous tax year and evaluate the extent to which their activities on those projects met the IRS criteria for qualified research.
Now, RetroacDev is unlocking a new capability for R&D tax credit experts: the power to harness an enormous amount of client project-tracking data from tools like GitHub, Jira, and Asana to quickly and accurately accomplish the objective of the traditional R&D employee interview.
RetroacDev's machine learning-driven Qualified Research Activity (QRA) classifier analyzes each task in the client's dataset, estimates how long it took, and classifies it into one of three broad categories of research-related activities defined by the IRS Audit Guidelines:
Then RetroacDev constructs a summary and data-driven insights about the portion of each employee's time that was qualified.
Finally, the platform uses the data to automatically reconstruct a contemporaneous record of all qualified activities performed by each employee on a day-to-day basis for the entire tax year. This work log is used to substantiate the business' Qualified Research Expenses in the event of an audit.
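To make the idea of a per-employee qualified-time calculation concrete, here is a minimal Python sketch. It is not RetroacDev's actual method: the `Task` fields, the boolean `qualified` flag (standing in for the classifier's output), and the sample data are all hypothetical. It only illustrates how estimated task durations could be rolled up into the share of each employee's time that was qualified.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Task:
    employee: str
    estimated_hours: float  # hypothetical: duration estimated for the task
    qualified: bool         # hypothetical: True if classified as a QRA


def qualified_share(tasks):
    """Return {employee: fraction of estimated hours spent on qualified tasks}."""
    total = defaultdict(float)
    qualified = defaultdict(float)
    for t in tasks:
        total[t.employee] += t.estimated_hours
        if t.qualified:
            qualified[t.employee] += t.estimated_hours
    # Skip employees with no recorded hours to avoid dividing by zero.
    return {e: qualified[e] / total[e] for e in total if total[e] > 0}


# Illustrative data only.
tasks = [
    Task("alice", 6.0, True),
    Task("alice", 2.0, False),
    Task("bob", 4.0, True),
    Task("bob", 4.0, False),
]
shares = qualified_share(tasks)
print(shares)
```

In this toy dataset, 6 of Alice's 8 estimated hours and 4 of Bob's 8 are qualified, so their shares come out to 0.75 and 0.5.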
Screenshot: Example of an Activity Breakdown for an R&D project in RetroacDev.
Using this combination of machine learning (ML), big data, and automation, RetroacDev is able to significantly reduce or entirely eliminate:
One of RetroacDev's greatest benefits is that it doesn't require onboarding client engineering teams to a new or separate tool where they have to re-record their R&D-related work solely for tax purposes, a responsibility that's likely not listed in any engineer's job description.
In fact, using RetroacDev doesn't require client engineers to change their workflow in any manner.
The platform is compatible with the most popular cloud-based tools engineers use day to day to manage their to-dos, communicate with teammates, and track overall project completion. The list of compatible tools includes:
Software engineering teams
Teams in any industry
Virtually all software teams now use Git, an open-source tool for tracking code changes. Software developers use Git to collaborate on writing source code. It's designed to handle everything from small to very large projects with speed and efficiency.
Diagram: Git helps engineers manage coding changes without interfering with the existing master codebase.
The (free) open-source Git technology underlies each of the most popular (paid) software engineering project management platforms: GitHub, GitLab, Microsoft Azure DevOps, Bitbucket, and others.
Commercial platforms like GitHub provide additional features teams can use throughout the entire product development lifecycle, including creating and assigning engineering tasks, tracking bugs, enabling engineers to review each other's code, and managing code compilation and deployment.
Importantly, the Git data RetroacDev requires is not the team's actual source code or intellectual property, but change-log metadata organically recorded each time an engineer saves a change to the code base. From an IT security perspective, it's impossible to reverse engineer source code from the data RetroacDev requires.
To test the efficacy and ROI of using RetroacDev, the international tax firm's R&D credit team performed an R&D study for an IT group totaling 80 software engineers.
For the same reasons described in the previous section, the IT group's Git data was chosen as the primary dataset for analysis.
The firm's objectives were twofold:
The tax team began the R&D study by requesting Git data from the IT group's four engineering managers. The group itself comprises several software teams brought together through acquisitions of various US-based software businesses over the last decade.
Each engineering manager was tasked with exporting data from the Git "repositories" (the folders that organize a codebase into files) their team worked on during the year. To export this data, engineering managers run a one-line command in each repository, which creates a human-readable text file (.txt) ready for import into RetroacDev. It takes only a few seconds to run the command and export data from each repository.
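The case study does not state the exact export command or file format, so the following is only a sketch under stated assumptions. It assumes the text file was produced by a standard `git log` invocation with a pipe-delimited format string, and shows how such change-log metadata (commit hash, author, date, message) could be parsed in Python without touching any source code. The sample lines and names are invented for illustration.

```python
from collections import Counter

# Assumed export command (NOT taken from the case study):
#   git log --pretty=format:"%H|%an|%ad|%s" --date=short > repo_log.txt
# %H = commit hash, %an = author name, %ad = author date, %s = subject line.


def parse_log(text):
    """Parse pipe-delimited git log lines into a list of dicts."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most 3 times so a "|" inside the subject is preserved.
        # (An author name containing "|" would break this simple parser.)
        commit, author, date, subject = line.split("|", 3)
        records.append(
            {"commit": commit, "author": author, "date": date, "subject": subject}
        )
    return records


# Invented sample standing in for the contents of an exported .txt file.
sample = """a1b2c3|Alice Example|2023-03-01|Fix race in cache | add test
d4e5f6|Bob Example|2023-03-01|Prototype new ranking model
0718a9|Alice Example|2023-03-02|Refactor parser"""

records = parse_log(sample)
changes_per_author = Counter(r["author"] for r in records)
print(changes_per_author)
```

Note that every field here is metadata about a change; none of it contains the changed code itself.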
In total, data was exported from 56 Git repositories, grouped into 16 R&D projects, and uploaded into RetroacDev.
(RetroacDev also has a direct API integration with GitHub, the most popular Git platform, which lets managers select some or all repositories for import into RetroacDev instead of using the manual export/import method.)
As the primary tool used in the software development process, Git often provides excellent data coverage for full-time software engineers. The dataset provided in this R&D study covered 84 software engineers and technical managers, who, on average, each made 191 code changes. Put differently, on average, each engineer completed one engineering task every 1.36 work days.
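The "one task every 1.36 work days" figure can be reproduced with simple arithmetic. The sketch below assumes a 260-day work year (52 weeks of 5 days, with no holidays deducted); the case study does not state the figure it used, but this assumption matches the quoted result.

```python
# Assumption: 52 weeks x 5 days, no holidays deducted.
WORK_DAYS_PER_YEAR = 260
# From the case study: average code changes per engineer per year.
changes_per_engineer = 191

days_per_change = WORK_DAYS_PER_YEAR / changes_per_engineer
print(round(days_per_change, 2))  # 1.36
```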
To determine the efficacy of RetroacDev's data-driven approach, the tax team consulted the IT group's engineering managers to review the software's Qualified R&D Time calculations for each engineer.
In cases where manual adjustments need to be made, RetroacDev users have the ability to override the software's calculations with their own. But that proved unnecessary in this study.
Together, the engineering managers concluded that RetroacDev's calculations were within 3-5% of what the tax team determined in the previous period using its traditional interview-based approach.
Using RetroacDev helped the firm's R&D credit team save hundreds of person-hours they otherwise would have spent conducting manual time allocation interviews and write-ups for 80+ software engineers.
"RetroacDev generated in seconds information previously gathered via the manual conduct of time allocation interviews for 80+ software employees. The usage of this software resulted in tax credit savings over $900,000 and saved the company hundreds of person-hours."
- Director, R&E Tax Credits
International Tax Consulting Firm
In the end, RetroacDev produced tax credit savings of over $900,000.
Regarding the quality of the automatically generated R&D activity documentation, the tax firm's Director of R&E Credit services concluded, "In addition to efficiencies gained, the substantiating information generated by the analyzing software exceeds exponentially that gathered utilizing the prior interview process."
The distribution of R&D employees (in this case, software engineers) and their calculated Qualified R&D Time.
|Total interview time eliminated, tax team|More than 100 hours|
|Total software cost to tax firm|$8,683.35|
|Total R&D tax credit savings|More than $900,000|
|Data source analyzed|Git|
|# of R&D employees covered|80+|
|Avg. # of tasks completed per employee, annually|191|
|Accuracy of RetroacDev's Qualified R&D Time calculations|Within 3-5% of interview results|
|Qualified R&D Time|# of Employees|
|Substantially all (80% or more)|32|