David Aronchick (aronchick@gmail.com)
2022-12-07 04:14:09

@David Aronchick has joined the channel

Julien Le Dem (julien@apache.org)
2022-12-07 04:14:20

@Julien Le Dem has joined the channel

Enrico Rotundo (enrico.rotundo@gmail.com)
2022-12-07 04:14:20

@Enrico Rotundo has joined the channel

David Aronchick (aronchick@gmail.com)
2022-12-07 04:14:50

Hi all! I’ve created this channel to chat about bringing OpenLineage to Bacalhau (https://www.bacalhau.org/)

David Aronchick (aronchick@gmail.com)
2022-12-07 04:15:23

the idea would be that at every step of a DAG execution, we automatically read the inputs (if any) and create a metadata file with OpenLineage information in it
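
A minimal sketch of that idea, assuming each DAG step knows the names of its inputs and outputs. The field names (`eventType`, `eventTime`, `run.runId`, `job`, `inputs`, `outputs`, `producer`) follow the OpenLineage run event model. The helper names, the `bacalhau` and `example-storage` namespaces, and the producer URL are hypothetical placeholders, not part of Bacalhau or OpenLineage. A spec-compliant event would also carry a `schemaURL` matching the spec version in use, which is left out here.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(job_name, input_names, output_names,
                    event_type="COMPLETE", namespace="bacalhau", run_id=None):
    """Build an OpenLineage-style run event for one DAG step as a plain dict."""

    def dataset(name):
        # "example-storage" is a placeholder dataset namespace (assumption).
        return {"namespace": "example-storage", "name": name}

    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [dataset(n) for n in input_names],
        "outputs": [dataset(n) for n in output_names],
        # Hypothetical producer identifier for this sketch.
        "producer": "https://example.com/bacalhau-lineage-sketch",
    }


def write_metadata_file(event, path):
    """Write the event next to the step's outputs as a JSON metadata file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(event, fh, indent=2)
```

A step would call `build_run_event("resize-images", ["raw-images"], ["thumbnails"])` once it finishes and pass the result to `write_metadata_file`, so the lineage record travels with the step's outputs.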

David Aronchick (aronchick@gmail.com)
2022-12-07 04:18:30

Enrico has done some early thinking on this - https://www.notion.so/pl-strflt/Initial-design-doc-Oct-22-d2b032bd16e340d3ada39171c9ad524d

David Aronchick (aronchick@gmail.com)
2022-12-07 04:19:06

and in parallel created an Airflow operator -> https://github.com/enricorotundo/bacalhau-airflow-provider

David Aronchick (aronchick@gmail.com)
2022-12-07 04:19:12

lemme know how i can help!

David Aronchick (aronchick@gmail.com)
2022-12-07 04:19:30

cc @Enrico Rotundo @Julien Le Dem

Enrico Rotundo (enrico.rotundo@gmail.com)
2022-12-07 07:45:04

hey @Julien Le Dem glad to meet you! The TL;DR is we’re going to use Airflow to orchestrate Bacalhau pipelines, and would love to add OpenLineage to pipelines too. I see Marquez already integrates with Airflow so that may be a good fit!

Philippe (philippe@polyphene.io)
2022-12-07 08:40:28

@Philippe has joined the channel

Julien Le Dem (julien@apache.org)
2022-12-07 22:03:29

Hello @Enrico Rotundo nice to meet you as well. FYI we have the monthly meeting on Zoom tomorrow if you guys want to join.

Julien Le Dem (julien@apache.org)
2022-12-07 23:35:21

https://openlineage.slack.com/archives/C01CK9T7HKR/p1670432586277209

Enrico Rotundo (enrico.rotundo@gmail.com)
2022-12-08 05:22:55

thanks for sharing that! I’ll join you!

Enrico Rotundo (enrico.rotundo@gmail.com)
2023-02-08 04:54:05

Hi @Julien Le Dem As I’m starting to build an Airflow operator for Bacalhau (which will include support for OpenLineage 🙂), I was wondering if you could share your knowledge about building operators. Why did you place the current operator in the OpenLineage repo rather than raising a PR to the “official” community-built Airflow repo? ~Is there any specific reason (community guidelines, technical, etc.)?~

Enrico Rotundo (enrico.rotundo@gmail.com)
2023-02-08 04:59:43

*Thread Reply:* Oh now that I’m reading your new operator draft AIP it all makes sense

Enrico Rotundo (enrico.rotundo@gmail.com)
2023-02-08 05:21:02

*Thread Reply:* ok so at this point the question is… for newborn operators, do you suggest starting with their own package or trying to merge into airflow.providers directly?

Julien Le Dem (julien@apache.org)
2023-02-08 20:42:34

*Thread Reply:* I think you can create your own Provider package with your operator. This is more of a question for the Airflow mailing list.

Julien Le Dem (julien@apache.org)
2023-02-08 20:43:25

*Thread Reply:* I would recommend this to add lineage support for your operator: https://openlineage.io/docs/integrations/airflow/operator/ https://openlineage.io/docs/integrations/airflow/extractors/default-extractors

Julien Le Dem (julien@apache.org)
2023-02-08 20:43:41

*Thread Reply:* And nice to hear from you @Enrico Rotundo!

Julien Le Dem (julien@apache.org)
2023-02-08 20:44:11

*Thread Reply:* The next monthly meeting is tomorrow if you want to join

Julien Le Dem (julien@apache.org)
2023-02-08 20:45:05

*Thread Reply:* I added you to the invite just in case

Julien Le Dem (julien@apache.org)
2023-02-08 20:46:17

*Thread Reply:* To answer your original question, we started outside the Airflow community; now that the project is more established, it is easier to get this approved. Hoping to get this to a vote soon

🙌 Enrico Rotundo
Mike Dillion (mike.dillion@gmail.com)
2023-02-11 18:51:40

@Mike Dillion has joined the channel

jrich (jasonrich85@icloud.com)
2023-03-10 14:52:21

@jrich has joined the channel

Nam Nguyen (nam@astrafy.io)
2023-07-14 05:37:45

@Nam Nguyen has joined the channel

Silvia Pina (silviampina@gmail.com)
2023-07-15 12:10:17

@Silvia Pina has joined the channel

YYYY XXXX (mail4registering@gmail.com)
2023-07-17 21:21:36

@YYYY XXXX has joined the channel

patrice quillevere (patrice.quillevere.csgroup.eu@gmail.com)
2023-07-18 11:11:13

@patrice quillevere has joined the channel

Rodrigo Maia (rodrigo.maia@manta.io)
2024-01-03 03:43:46

@Rodrigo Maia has joined the channel

jayant joshi (itsjayantjoshi@gmail.com)
2024-01-24 01:10:30

@jayant joshi has joined the channel

Santiago Cobos (santiago.cobos@ibm.com)
2024-03-25 16:42:35

@Santiago Cobos has joined the channel

Nicolas Pittion-Rossillon (nicolas.pittion@redlab.io)
2024-04-18 03:14:45

@Nicolas Pittion-Rossillon has joined the channel

Farnaz Salamatjoo (farnaz.salamatjoo1988@gmail.com)
2024-05-13 02:38:06

@Farnaz Salamatjoo has joined the channel

Rishabh Pareek (rishabh.pareek@infoobjects.com)
2024-06-03 01:01:13

@Rishabh Pareek has joined the channel

Yuanli Wang (yuanliw@bu.edu)
2024-06-09 01:33:55

@Yuanli Wang has joined the channel