Buildroot Developers Meeting, 3-5 February 2020, Brussels

The Buildroot Developers meeting is a 3-day event for Buildroot developers and contributors. It allows them to discuss hot topics in Buildroot development, work on patches, and generally meet each other, facilitating further online discussions. Attending the event is free, after registration. The Buildroot project has some travel funding for participants who would like to attend but cannot due to financial constraints.

Location and date

The next Buildroot Developers meeting will take place on February 3-5, 2020 in Brussels, right after the FOSDEM conference. The meeting will take place in the Google offices, located at Chaussée d'Etterbeek 180, 1040 Brussels, very close to the Schuman metro station.

Sponsors


We would like to thank our sponsors:

  • Google, providing the meeting location, with an Internet connection.

Participants

  1. Thomas Petazzoni
  2. Peter Korsgaard
  3. Angelo Compagnucci
  4. Arnout Vandecappelle
  5. Titouan Christophe
  6. Heiko Thiery (till Tuesday afternoon)
  7. Adam Duskett
  8. Michael Walle (till Tuesday afternoon)
  9. Thomas De Schampheleire (will be physically present on Tuesday and Wednesday only; could join remotely on Monday)
  10. Mark Corbin
  11. Yann E. MORIN
  12. Romain Naour
  13. Jochen Baltes (till Tuesday afternoon)

Meeting agenda

  • Autobuilder stats and dashboarding with Grafana (Titouan)
  • Continue Python2 deprecation (Titouan)
  • Rust/cargo download infrastructure: ripgrep currently does downloads during the build
    • Approach similar to what was discussed for Go?
  • Patch backlog management: shall we experiment with Merge Requests on gitlab? (Arnout)
    • Why? It is faster because the .gitlab-ci.yml checks have already been executed, and for a bigger series of simple patches it reduces the review+merge time from 30 minutes to 10 minutes.
    • Proposal: try it out with a few well-known developers, and only for "low-hanging fruit" series.
    • Thomas: nooooo :-)
    • Peter: I am also not really convinced
  • Status of top-level parallel build support, where do we go from there

Notes from the meeting

Google Summer of Code

  • First let's check if we have mentors. Nobody volunteers, so it doesn't make sense to register for it.
  • Last year we didn't invest a lot of time on mentoring. Results would have been better with better (more) mentoring. ThomasP had an intern around the same time and that intern needed a lot less mentoring time than our GSoC student, so really it depends on the student how much time is required and what results you can expect.
  • It is more efficient to mentor interns in a company than through a GSoC project. The Buildroot Association could sponsor such endeavours, but we have to see how that would work from an admin/legal point of view.

Rust/cargo download infrastructure: ripgrep currently does downloads during the build

  • Approach similar to what was discussed for Go?
  • PHP has a similar thing (php-composer).
  • The problem: a cargo package has dependencies, typically lots of them (e.g. ripgrep, which is relatively small, has 46 dependencies in total, recursively). Those dependencies need to be downloaded before we can start the build. At the moment, they are downloaded automatically by cargo during the build step. We would want that:
    • They are downloaded in the download step.
    • They are downloaded to DL_DIR, so they can be reused between different builds. Note that this requires locking.
    • They are properly hashed and verified. Note that cargo already takes care of this.
    • legal-info works, i.e. the tarballs and patches of dependencies are also put in the legal-info directory.
      • the licenses and license files still need to be put into the .mk file "manually", but whatever tooling is used (licensecheck, fossology) can use this tarball as input.
    • They are fetched from PRIMARY and BACKUP site.
    • PRIMARY_ONLY works.
    • It should be possible to patch the dependencies.
    • If several packages have the same dependency (at the same version), that dependency shouldn't be re-downloaded over and over again.
    • Ideally, all the package managers can share some infrastructure.
  • Note that we will assume that all dependencies are "vendored": we're not going to try to make separate packages for all of the dependencies. That would be fighting too much with upstream.
  • Some things we want to keep explicit anyway: LICENSES and LICENSE_FILES. For these, however, we can have a script like scanpypi that generates them.
  • Four possible approaches were discussed:
    • If cargo supports it, we could convince it to use PRIMARY and BACKUP directly. In other words, the dependencies are fetched in a post-download hook. It may be impossible to implement all wanted features that way, or explicit support is needed for them (e.g. legal-info, patching dependencies). Everyone agrees this is not the way to go.
    • 'cargo' is a full download method like git, and it creates a single tarball, like git with submodules does. All the infrastructure for PRIMARY and legal-info can be reused. CARGO_HOME would be inside the tarball, so patches can be applied to dependencies without problem. The only limitation of this approach is that the last requirement (single download of shared dependencies) won't work. A nice thing is that as soon as it's a cargo package, it automatically uses the cargo download method, so converting existing packages is trivial and it's really easy on the user. We can still override the upstream-provided Cargo.lock file, but it would need some special treatment in the infra.
    • Make the dependencies explicit; don't rely on cargo to download them, but specify them as _EXTRA_DOWNLOADS (or something which automatically populates that). The normal download infrastructure is used for everything. This requires converting the Cargo.lock to a buildroot-specific format, which can be automated, although such a script is not going to be trivial.
    • Don't do anything. ripgrep works as it is, though it satisfies none of the wish list above.
  • We conclude that the second option is the one we should go for.
    • There should be a new download method for cargo, go, composer.
    • This download method calls directly into the appropriate download helper to download the upstream package. The download method for that is still specified with _SITE_METHOD. It is passed along with an extra + in front of the URL, e.g. cargo+git+https://foo.org/proj/, and the cargo download helper just passes the last part down to the upstream download. Note that this download will not make use of PRIMARY or BACKUP; it only downloads from the original site. The result of this download is stored in a temporary file and will not be exposed: it will be removed when the temporary directory created by dl-wrapper is removed.
      • We could optimise things a bit by not creating a tarball at all when using the git/svn/hg download methods. But that can be done later.
    • There is no need to hash and verify this download - the final tarball will be hashed and verified.
    • The cargo download method extracts the upstream package, runs cargo on it with CARGO_HOME pointing inside the extracted directory, then makes a tarball of the result. That tarball is named $(PKG)_SOURCE.
    • The tarball that includes all dependencies is used for everything else: PRIMARY and BACKUP, legal-info, offline builds, ...
  • Note that the RFC series http://patchwork.ozlabs.org/project/buildroot/list/?series=156166 can still be applied as an intermediate step.
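The URI dispatch and vendoring flow discussed above can be sketched in shell. Everything here is an assumption drawn from the notes, not the final implementation: the helper name, the exact URI syntax, and the vendoring steps are all hypothetical. Only the `cargo+` prefix handling is runnable; the rest is outlined in comments because it needs cargo and network access:

```shell
# Hypothetical dispatch for a "cargo+git+https://..." URI: the first
# component selects the vendoring helper, the rest is handed to the
# normal upstream download backend.
uri='cargo+git+https://foo.org/proj/'
helper=${uri%%+*}      # everything before the first '+'
upstream=${uri#*+}     # everything after the first '+'
echo "helper=$helper"
echo "upstream=$upstream"

# The cargo helper itself would then, roughly:
#   1. download $upstream into a temporary file via the normal backend
#      (no PRIMARY/BACKUP at this stage, per the notes)
#   2. extract it and run cargo (e.g. "cargo vendor", or "cargo fetch"
#      with CARGO_HOME inside the tree) to pull all dependencies
#   3. re-pack the whole tree as the $(PKG)_SOURCE tarball, which then
#      flows through PRIMARY/BACKUP, hashing and legal-info as usual
```

Running this prints `helper=cargo` and `upstream=git+https://foo.org/proj/`, illustrating why the existing _SITE_METHOD machinery can stay untouched: the helper only peels off its own prefix.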

Patch backlog management: shall we experiment with Merge Requests on gitlab? (Arnout)

  • Why? It is faster because the .gitlab-ci.yml checks have already been executed, and for a bigger series of simple patches it reduces the review+merge time from 30 minutes to 10 minutes.
  • Proposal: try it out with a few well-known developers, and only for "low-hanging fruit" series.
  • Thomas: nooooo :-)
  • Peter: I am also not really convinced
  • Yann: not in favour of it, by a long shot... :-/
  • Patchwork has a feature where you can create a bundle of a series and download it in one shot.
  • For the CI part, you can actually just push to a gitlab fork and point to the pipeline in the cover letter.
  • Having two possible workflows is going to be confusing for contributors.
  • One reason to do this would be to attract more contributions. However, we're really not lacking contributions at the moment...
  • Maybe we should also document more things about the workflow. E.g. putting the mailing list in nomail mode.

Conclusion: we're not going to do this.

Documentation improvements

  • It would be nice if we could use a workflow like the kernel's, using reStructuredText source and Sphinx to generate output from it.
  • People do use the PDF output, so we certainly have to keep that.

Autobuilder stats and dashboarding with Grafana (Titouan)

  • A PoC is live on one of Adam's servers. Logging in is required; ask iTitou on IRC to get access.
  • We can move it to OSUOSL. We should also move the autobuild server there.

Continue Python2 deprecation (Titouan)

  • Upstream EOL of python2 has been delayed a bit, so we don't need to force deprecation immediately.
  • => Anyhow, this only gives us 4 additional months, so no big changes to the roadmap.
  • All patches for waf are now on patchwork.
  • Titouan keeps the wiki page up to date. (see https://elinux.org/Buildroot:Python2Packages)
  • Next step is to build host-python3 by default, and only build host-python if we build target python.
  • Principle going forward is that a package that needs python during the build will use python3, independent of the python that is on the target.
    • Packages that really need python2 during the build can of course still depend on host-python
    • Packages that just need a python during the build will depend on host-python3 and use $(HOST_DIR)/bin/python3
    • It is not required to change the existing packages right away. It's just a principle for anything new, or when a package is anyway modified.
    • Host python modules will always use host-python3, so the host-python-package infra should be adapted to handle that.
    • Note that this means that when you build python2 for the target, you will most likely build both host-python and host-python3. So be it.
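A minimal sketch of the principle above. The paths are hypothetical stand-ins: in a real build the interpreter would be $(HOST_DIR)/bin/python3 as built by host-python3, and the fallback below exists only so this sketch runs on an ordinary machine:

```shell
# "Always python3 for build-time use": a package build step invokes an
# explicit python3 path, never a bare "python" (which may be python2,
# or whatever the build machine happens to provide).
HOST_DIR=${HOST_DIR:-/usr}          # stand-in for Buildroot's $(HOST_DIR)
PYTHON3="$HOST_DIR/bin/python3"
# Fallback for this sketch only, when nothing is installed at that path:
[ -x "$PYTHON3" ] || PYTHON3=$(command -v python3)
major=$("$PYTHON3" -c 'import sys; print(sys.version_info[0])')
echo "$major"
```

Whatever python is on the target, the build-time interpreter is pinned to a python3, which is exactly what makes the host-python/host-python3 split safe.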

Status of top-level parallel build support, where do we go from there

  • Status is tracked in https://elinux.org/Buildroot:Top_Level_Parallel_Build
  • Package-specific issues (e.g. qt5, python modules)
  • make foo-reinstall is currently broken: if you reinstall a package foo which is also a dependency of another package bar, then bar will still have the old version of foo in its per-package directories. In the final rsync, the old foo may survive (depending on the order of the rsync).
  • Another issue arises when different packages change the same file by appending (i.e. not replacing the existing hardlink, but changing the linked file, which is in the per-package directory of another package).
  • We could take one step: during installation, change STAGING_DIR/TARGET_DIR to something else. There are probably some packages that still write to the STAGING_DIR set during configure, so (for the time being) we should still use the "old" staging dir as well. But it is a good step towards identifying conflicts.
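The append-through-hardlink problem can be reproduced outside Buildroot in a few lines. The directory names are hypothetical; in a real build the per-package trees are populated by hardlinking (rsync -H style), which is what this imitates:

```shell
# Demonstrate the hazard: per-package directories share files as
# hardlinks, so appending via one package's tree silently changes the
# file seen by the other package.
tmp=$(mktemp -d)
mkdir -p "$tmp/per-package/foo/target/etc" "$tmp/per-package/bar/target/etc"
echo 'set by foo' > "$tmp/per-package/foo/target/etc/profile"
# bar's tree gets a hardlink to foo's file, as rsync -H would create:
ln "$tmp/per-package/foo/target/etc/profile" "$tmp/per-package/bar/target/etc/profile"
# bar appends instead of replacing the file...
echo 'appended by bar' >> "$tmp/per-package/bar/target/etc/profile"
# ...so foo's copy is modified too, since both names share one inode:
content=$(cat "$tmp/per-package/foo/target/etc/profile")
echo "$content"
rm -rf "$tmp"
```

This prints both lines from foo's copy, even though foo never touched the file after creating it - which is why a replacing copy (break the link, write a new file) is safe while an append is not.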

Status of reproducible builds support, where do we go from there

  • Current status: the autobuilder will set BR2_REPRODUCIBLE, do a second build, and check if the two builds are identical.
  • The output is not always so clear. E.g. the reason is "unknown".
  • diffoscope is a useful tool: https://diffoscope.org/
  • Use case for ThomasDS: when after a few years you rebuild an old version, you want to be sure that you are really looking at the same thing. Binary-identical output helps to be sure.
  • For such use cases, you really should still store your build environment, and build in the same path. What works well at the moment is variation of user and time. Variation of build directory is not so great, and variation of build environment (distro) is not tested at all so probably broken.
  • Note that BR2_REPRODUCIBLE has a cost, because you no longer have a record of who did the build or at what exact time the build was done.
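The autobuilder check described above boils down to hashing the artifacts of both builds and comparing. A toy sketch of that comparison, with hypothetical paths and identical dummy files standing in for the two builds' images:

```shell
# Toy version of the BR2_REPRODUCIBLE check: build twice, compare the
# output artifacts bit for bit. Real runs compare e.g. rootfs images;
# when they differ, diffoscope can explain where and why.
tmp=$(mktemp -d)
mkdir -p "$tmp/build1/images" "$tmp/build2/images"
printf 'pretend-rootfs' > "$tmp/build1/images/rootfs.tar"
printf 'pretend-rootfs' > "$tmp/build2/images/rootfs.tar"
h1=$(sha256sum "$tmp/build1/images/rootfs.tar" | cut -d' ' -f1)
h2=$(sha256sum "$tmp/build2/images/rootfs.tar" | cut -d' ' -f1)
if [ "$h1" = "$h2" ]; then
    result=reproducible
else
    result=not-reproducible   # here one would run diffoscope on the pair
fi
echo "$result"
rm -rf "$tmp"
```

A hash comparison only answers yes or no; the "unknown reason" problem in the autobuilder output is exactly the gap that a diffoscope run on the differing pair would fill.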