.. _gitlab_file_import:

GitLab import from a file
=========================

GitLab provides an option to `import a new project from a file `_. This option was
created for migrating projects from one GitLab instance to another. For the
Pagure to GitLab importer we would need to adapt Pagure data to the file format
used by GitLab. This document investigates that option.

GitLab file format
------------------

To investigate the GitLab export format I exported the `test project `_ I created
during the investigation of the GitLab API. See :doc:`gitlab`.

The export generates one archive in `tar.gz` format. The archive contains the
following directory structure:

.. code-block::

   2023-01-20_11-48-813_testgroup519_arc_export
   ├── GITLAB_REVISION
   ├── GITLAB_VERSION
   ├── lfs-objects
   │   └── 45a5d77993d525cdda15d08e63c34339a1bf49a43756a05908082bb04b4c4087
   ├── lfs-objects.json
   ├── project.bundle
   ├── project.design.bundle
   ├── snippets
   ├── tree
   │   ├── project
   │   │   ├── auto_devops.ndjson
   │   │   ├── boards.ndjson
   │   │   ├── ci_cd_settings.ndjson
   │   │   ├── ci_pipelines.ndjson
   │   │   ├── container_expiration_policy.ndjson
   │   │   ├── custom_attributes.ndjson
   │   │   ├── error_tracking_setting.ndjson
   │   │   ├── external_pull_requests.ndjson
   │   │   ├── issues.ndjson
   │   │   ├── labels.ndjson
   │   │   ├── merge_requests.ndjson
   │   │   ├── metrics_setting.ndjson
   │   │   ├── milestones.ndjson
   │   │   ├── pipeline_schedules.ndjson
   │   │   ├── project_badges.ndjson
   │   │   ├── project_feature.ndjson
   │   │   ├── project_members.ndjson
   │   │   ├── prometheus_metrics.ndjson
   │   │   ├── protected_branches.ndjson
   │   │   ├── protected_environments.ndjson
   │   │   ├── protected_tags.ndjson
   │   │   ├── push_rule.ndjson
   │   │   ├── releases.ndjson
   │   │   ├── security_setting.ndjson
   │   │   ├── service_desk_setting.ndjson
   │   │   └── snippets.ndjson
   │   └── project.json
   ├── uploads
   │   └── 8b4f7247f154d0b77c5d7d13e16cb904
   │       └── Infra___Releng_2022.jpg
   └── VERSION

   7 directories, 35 files

The following list explains some of the files found in the archive:

- GitLab metadata files (version and revision)

- `.bundle` file which is created by the `git bundle `_ command. You can easily
  look at the content of a `.bundle` file by using the `git clone` command.

- `.design.bundle` contains all the attachments from issues and merge requests.
  It is a repository bundled by the `git bundle `_ command.

- `lfs-objects.json` contains a list of hashes of designs and their mapping to
  issue id or merge request id. This is something we can skip, because Pagure
  doesn't have this feature.

- `VERSION` file contains a version, but I was not able to find out what this
  version refers to. My assumption is that it's the version of the export tool.

- `lfs-objects/` folder contains all the designs named by hash. This is
  something we can skip, because Pagure doesn't have this feature.

- `snippets/` folder contains `GitLab snippets `_.

- `tree/project.json` file contains all the project metadata in JSON format.

- `tree/project/` contains files in `ndjson format `_ describing various objects
  defined in the GitLab project. For the purpose of this investigation only
  `issues.ndjson` and `merge_requests.ndjson` are important for us.

- `uploads/` folder contains all the attachments from issues or merge requests.

Conversion of Pagure project to GitLab file formats
---------------------------------------------------

For the purpose of the investigation I tried to convert the `ARC project `_ hosted
on Pagure to the GitLab import format. I started with the export generated by
GitLab and changed the files to correspond to what I wanted to import.

Here is the list of all files that I needed to prepare, with an explanation of
their content:

- `project.bundle` is a binary bundle file created by the `git bundle `_ command.
  It was created by running `git bundle create project.bundle --all` inside the
  ARC project repository.

- `tree/project/issues.ndjson` contains the issue descriptions in
  `ndjson format `_. The file has `project_id` or `author_id` set to 0 and instead
  carries an `author` object with the FAS username and public FAS e-mail.
  Unfortunately, if the `author_id` isn't recognized by GitLab, the issue or
  comment is created as the user who is running the import, and the `author`
  object in the JSON is completely ignored.

  .. code-block:: json

     {"title":"Investigate the GitLab API for Pagure to Gitlab importer","author_id":0,"author":{"username": "zlopez","email": "michal.konecny@pacse.eu"},"project_id":42729361,"created_at":"2023-01-19T11:41:40.000Z","updated_at":"2023-01-19T14:06:47.659Z","description":"Investigate the GitLab API for Pagure to Gitlab importer ARC investigation. This ticket will also work as a test ticket in investigation.","iid":1,"updated_by_id":null,"weight":null,"confidential":false,"due_date":null,"lock_version":0,"time_estimate":0,"relative_position":513,"last_edited_at":null,"last_edited_by_id":null,"discussion_locked":null,"closed_at":"2023-01-19T14:06:47.641Z","closed_by_id":3072529,"health_status":null,"external_key":null,"issue_type":"issue","state":"closed","events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T13:07:11.164Z","updated_at":"2023-01-19T13:07:11.164Z","action":"created","target_type":"Issue","fingerprint":null},{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T14:06:47.712Z","updated_at":"2023-01-19T14:06:47.712Z","action":"closed","target_type":"Issue","fingerprint":null}],"timelogs":[],"notes":[{"note":"Here's a sample comment as you requested @zlopez.","noteable_type":"Issue","author_id":3072529,"created_at":"2023-01-19T12:59:59.000Z","updated_at":"2023-01-19T12:59:59.000Z","project_id":42729361,"attachment":{"url":null},"line_code":null,"commit_id":null,"st_diff":null,"system":false,"updated_by_id":null,"type":null,"position":null,"original_position":null,"resolved_at":null,"resolved_by_id":null,"discussion_id":"f98cdeabaaec68ae453e1dbf5d9e535fbbcede0a","change_position":null,"resolved_by_push":null,"confidential":null,"last_edited_at":"2023-01-19T12:59:59.000Z","author":{"name":"Zlopez"},"award_emoji":[],"events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T13:13:21.071Z","updated_at":"2023-01-19T13:13:21.071Z","action":"commented","target_type":"Note","fingerprint":null}]}],"label_links":[],"resource_label_events":[],"resource_milestone_events":[],"resource_state_events":[{"user_id":3072529,"created_at":"2023-01-19T14:06:47.734Z","state":"closed","source_commit":null,"close_after_error_tracking_resolve":false,"close_auto_resolve_prometheus_alert":false}],"designs":[],"design_versions":[],"issue_assignees":[],"zoom_meetings":[],"award_emoji":[],"resource_iteration_events":[]}
     {"title":"Test open issue","author_id":0,"author":{"username": "akashdeep","email": "akashdeep.dhar@gmail.com"},"project_id":42729361,"created_at":"2023-01-19T14:07:05.823Z","updated_at":"2023-01-20T11:48:02.495Z","description":"Test open issue","iid":2,"updated_by_id":null,"weight":null,"confidential":false,"due_date":null,"lock_version":0,"time_estimate":0,"relative_position":1026,"last_edited_at":null,"last_edited_by_id":null,"discussion_locked":null,"closed_at":null,"closed_by_id":null,"health_status":null,"external_key":null,"issue_type":"issue","state":"opened","events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-19T14:07:05.930Z","updated_at":"2023-01-19T14:07:05.930Z","action":"created","target_type":"Issue","fingerprint":null}],"timelogs":[],"notes":[{"note":"![Infra___Releng_2022](/uploads/8b4f7247f154d0b77c5d7d13e16cb904/Infra___Releng_2022.jpg)","noteable_type":"Issue","author_id":3072529,"created_at":"2023-01-20T11:48:02.435Z","updated_at":"2023-01-20T11:48:02.435Z","project_id":42729361,"attachment":{"url":null},"line_code":null,"commit_id":null,"st_diff":null,"system":false,"updated_by_id":null,"type":null,"position":null,"original_position":null,"resolved_at":null,"resolved_by_id":null,"discussion_id":"30302c7dee98663fcfca845a2ec2715eb3e35e4f","change_position":null,"resolved_by_push":null,"confidential":null,"last_edited_at":"2023-01-20T11:48:02.435Z","author":{"name":"Zlopez"},"award_emoji":[],"events":[{"project_id":42729361,"author_id":3072529,"created_at":"2023-01-20T11:48:02.617Z","updated_at":"2023-01-20T11:48:02.617Z","action":"commented","target_type":"Note","fingerprint":null}]},{"note":"added [1 design](/testgroup519/arc/-/issues/2/designs?version=490993)","noteable_type":"Issue","author_id":3072529,"created_at":"2023-01-19T14:07:45.310Z","updated_at":"2023-01-19T14:07:45.315Z","project_id":42729361,"attachment":{"url":null},"line_code":null,"commit_id":null,"st_diff":null,"system":true,"updated_by_id":null,"type":null,"position":null,"original_position":null,"resolved_at":null,"resolved_by_id":null,"discussion_id":"e15e7c584cc7e6c7e298529f034f0b55eeacca90","change_position":null,"resolved_by_push":null,"confidential":null,"last_edited_at":"2023-01-19T14:07:45.315Z","author":{"name":"Zlopez"},"award_emoji":[],"system_note_metadata":{"commit_count":null,"action":"designs_added","created_at":"2023-01-19T14:07:45.343Z","updated_at":"2023-01-19T14:07:45.343Z"},"events":[]}],"label_links":[],"resource_label_events":[],"resource_milestone_events":[],"resource_state_events":[],"designs":[{"project_id":42729361,"filename":"Infra___Releng_2022.jpg","relative_position":null,"iid":1,"notes":[]}],"design_versions":[{"sha":"69419c215f53d401c1b3c451e6fc08e3351d2679","created_at":"2023-01-19T14:07:45.233Z","author_id":3072529,"actions":[{"event":"creation","design":{"project_id":42729361,"filename":"Infra___Releng_2022.jpg","relative_position":null,"iid":1}}]}],"issue_assignees":[],"zoom_meetings":[],"award_emoji":[],"resource_iteration_events":[]}

Importing the archive to GitLab
-------------------------------

The archive for the migration is prepared by executing the
`tar -czvf test_arc_export.tar.gz .` command. It needs to be executed in the
root folder of the prepared file structure, otherwise the import will fail with
`No such file or directory`.

To import the archive to GitLab, an `API call `_ can be used. Here is the full API
call made by `curl`:

.. code-block::

   curl --request POST --header "PRIVATE-TOKEN: XXX" --form "namespace=testgroup519" --form "path=arc2" --form "file=@test_arc_export.tar.gz" "https://gitlab.com/api/v4/projects/import"

To check for any error in the import, use the GitLab `import status API call `_.
This can be made by `curl`:

.. code-block::

   curl --header "PRIVATE-TOKEN: XXX" "https://gitlab.com/api/v4/projects//import"

Conclusion
----------

At this point I ended the investigation, because the situation is the same as
when using the API, which is much more convenient to use and provides better
responses in case of errors (I spent two days trying to debug the
`No such file or directory [FILTERED]` error message).
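
For reference, the conversion and packaging steps described in this document
can be sketched in Python. This is only a minimal sketch under stated
assumptions, not a tested importer: the input keys (`id`, `title`, `content`,
`status`, `username`, `email`) are hypothetical names for data pulled from
Pagure, the function names are made up for illustration, and the reduced set of
issue fields has not been verified against a real GitLab import. The packaging
function stores every path relative to the root of the prepared structure,
which mirrors running `tar` inside the root folder.

.. code-block:: python

   import json
   import tarfile
   from pathlib import Path


   def pagure_issue_to_gitlab(issue, project_id=0):
       """Map a Pagure-like issue dict (hypothetical keys) to GitLab issue fields."""
       return {
           "title": issue["title"],
           # 0 is not a known GitLab user, so the importing user becomes the author.
           "author_id": 0,
           "author": {"username": issue["username"], "email": issue["email"]},
           "project_id": project_id,
           "description": issue["content"],
           "iid": issue["id"],
           "state": "closed" if issue["status"] == "Closed" else "opened",
           "notes": [],
       }


   def write_issues_ndjson(issues, path):
       """Write one JSON object per line (ndjson)."""
       path.parent.mkdir(parents=True, exist_ok=True)
       with path.open("w", encoding="utf-8") as handle:
           for issue in issues:
               handle.write(json.dumps(issue) + "\n")


   def pack_export(root, archive):
       """Create a tar.gz whose entries are relative to the root folder."""
       with tarfile.open(archive, "w:gz") as tar:
           for entry in sorted(root.rglob("*")):
               # Skip the archive itself if it is written inside the root folder.
               if entry.resolve() == archive.resolve():
                   continue
               tar.add(
                   entry,
                   arcname=entry.relative_to(root).as_posix(),
                   recursive=False,
               )

A call such as `pack_export(Path("export"), Path("test_arc_export.tar.gz"))`
would then produce an archive with `tree/project/issues.ndjson` at the expected
relative path. Whether GitLab accepts such an archive still has to be checked
with the import status call.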