
Re: A CI job inventory

From: Timothe Litt <litt_at_acm.org>
Date: Mon, 7 Feb 2022 18:41:54 -0500

Agree with the thrust of these comments.

Perhaps rather than add metadata, have a utility that each CI job runs
to update a database at setup and/or exit.

This would also automate updating descriptions and the like, as well as
registering new jobs, without a separate process.

It could give you actual runtimes, fail counts, etc.
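
As a rough sketch of what such a reporting utility might look like - the
endpoint URL, the CI_SERVICE/CI_JOB_NAME variable names, and the payload
fields below are all my own assumptions, not anything curl's CI defines:

```python
import json
import os
import time
import urllib.request

def build_record(phase, env, extra=None):
    """Describe this CI run; 'env' is normally os.environ."""
    record = {
        # Variable names are hypothetical; each CI service exposes
        # its own identifiers that a real utility would map from.
        "service": env.get("CI_SERVICE", "unknown"),
        "job": env.get("CI_JOB_NAME", "unknown"),
        "phase": phase,          # "setup" or "exit"
        "timestamp": time.time(),
    }
    if extra:
        record.update(extra)     # e.g. {"failed_tests": 2}
    return record

def report_job(endpoint, phase, extra=None):
    """POST one record to the central collection server."""
    data = json.dumps(build_record(phase, os.environ, extra)).encode()
    req = urllib.request.Request(
        endpoint, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A job would call it once at setup and once at exit, so the server sees
both start times and outcomes even when a job is cancelled mid-run.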

The mechanics might be a bit involved, but not difficult - the utility
would probably have to send its updates to a server, since most CI
environments don't provide a persistent store - and you'd want data from
all environments in one place anyhow.

Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.

On 07-Feb-22 18:07, Dan Fandrich via curl-library wrote:
> On Mon, Feb 07, 2022 at 11:10:39PM +0100, Daniel Stenberg via curl-library wrote:
>> In order to get better overview and control of the jobs we run, I'm
>> proposing that we create and maintain a single file that lists all the jobs
>> we run. This "database" of jobs could then be used to run checks against and
>> maybe generate some tables or charts and what not to help us make sure our
>> CI jobs really cover as many build combinations as possible and perhaps it
>> can help us reduce duplications or too-similar builds.
> I suspect we will be able to count the time in hours before such a list
> diverges from the actual CI jobs being run because somebody forgot to update
> the master list properly. Such a list will be pure duplication of information
> already found in the CI configuration files, too. I would rather treat the CI
> files as the sources of truth and derive a dashboard by parsing those instead,
> to show the jobs that are *actually* being run. The downside to that, of
> course, is that you'd need to write code to parse 6 different CI configuration
> file formats, but the significant benefit is that you could always trust the
> dashboard.
>
> Another approach would be to add metadata to the different CI configuration
> files that the dashboard could read from each file in a consistent format, such
> as a specially-formatted comment, structured job title, and/or special
> environment variable definition. That makes parsing easier, but it means that
> people would need to remember to update the metadata when they update or add a
> job. The metadata could still fall out of date for that reason, but it's less
> likely to happen than with a separate, central job registry because the
> metadata will always be found along with the job configuration. It should also
> be relatively easy to at least count the number of jobs defined in each CI
> configuration file and flag those without a special metadata line (catching new
> uncategorised jobs).
>
> Maybe a hybrid approach is the best; read and parse as much job data as
> practical from the job name and "env" section of each CI configuration file
> (which should be pretty simple and stable to retrieve), and supplement that
> with additional data from a structured comment (or magic "env" variable), where
> necessary.
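
The magic-env-variable idea above could be prototyped with very little
code; here the CURL_CI_META name and the key=value format are assumptions
of mine, not an existing convention:

```python
import re

# Hypothetical convention: each job defines one env variable whose
# value is a space-separated list of key=value attributes.
META_VAR = "CURL_CI_META"

def parse_meta(text):
    """Return one dict of attributes per META_VAR line in a CI config."""
    jobs = []
    for m in re.finditer(META_VAR + r':\s*["\']?([^"\'\n]+)', text):
        pairs = dict(p.split("=", 1) for p in m.group(1).split())
        jobs.append(pairs)
    return jobs
```

Comparing len(parse_meta(...)) against a count of job entries in the same
file would flag new, unannotated jobs, as Dan suggests.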
>
> The only way I'd advocate for a new central job description file is if it could
> be used to mechanically generate the CI job files. That would mean there would
> be only one source of truth, but this approach would also be pretty impractical
> due to the complexity of many job configurations and the need to write 6
> different configuration file formats.
>
> Dan

-- 
Unsubscribe: https://lists.haxx.se/listinfo/curl-library
Etiquette:   https://curl.haxx.se/mail/etiquette.html
Received on 2022-02-08