-
Notifications
You must be signed in to change notification settings - Fork 41
Support .gitattributes export-ignore in distribution archives #515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 1 commit
15f33ee
892f343
7864bf5
1342036
0988a73
f5f00f6
1a002b0
5dad686
f871732
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -110,6 +110,31 @@ The full documentation of this command is available with | |
| dune-release help distrib | ||
| ``` | ||
|
|
||
| #### Excluding files with .gitattributes | ||
|
|
||
| Files marked with the `export-ignore` attribute in `.gitattributes` will be excluded from the distribution archive. This can be used to exclude development files like `dune-workspace` that should not be included in releases. | ||
|
|
||
| Example `.gitattributes`: | ||
| ``` | ||
| dune-workspace export-ignore | ||
| .github/** export-ignore | ||
| ``` | ||
|
|
||
| **Supported patterns:** | ||
| - Exact filenames: `dune-workspace` | ||
| - Directory patterns: `.github/**` (matches all files under `.github/`) | ||
| - Glob patterns: `*.log`, `test_*`, `file?.txt` | ||
| - Double star in path: `**/build`, `src/**/test.ml` | ||
| - Path normalization: handles `./` and `../` in paths | ||
|
|
||
| **Not supported:** | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it would be good to state what "not supported" means in practice. Will it error? Will it ignore the pattern?
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point. I have pushed a commit that detects the unsupported pattern syntax, and updated the README to distinguish such cases (that are skipped with a warning) and other limitations (that are silent). |
||
| - Negation patterns (`!pattern`) | ||
| - Escaped patterns (`\!` for literal `!`) | ||
| - Quoted patterns (`"a b"` for patterns with spaces) | ||
| - Character classes (`[abc]`) | ||
| - Case insensitivity (`core.ignorecase`) | ||
| - `.gitattributes` files in subdirectories | ||
|
|
||
|
|
||
| ### Publish the distribution online | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -117,33 +117,45 @@ module Tar = struct | |
| String.concat (List.rev (end_of_file :: t)) | ||
| end | ||
|
|
||
| let path_set_of_dir dir ~exclude_paths = | ||
| let not_excluded p = Ok (not (Fpath.Set.mem (Fpath.base p) exclude_paths)) in | ||
| let path_set_of_dir dir ~exclude_paths ~export_ignore = | ||
| let not_excluded p = | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To avoid the negation, maybe rename it to
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Renamed. My thinking was that conceptually what is being done is "excluding" and so kept that base term, but fair point about double-negation. |
||
| if Fpath.Set.mem (Fpath.base p) exclude_paths then Ok false | ||
| else | ||
| match Fpath.rem_prefix dir p with | ||
| | None -> Ok true | ||
| | Some rel_path -> | ||
| Ok (not (List.exists (Gitattributes.matches rel_path) export_ignore)) | ||
| in | ||
| let traverse = `Sat not_excluded in | ||
| let elements = `Sat not_excluded in | ||
| let err _ e = e in | ||
| OS.Dir.contents ~dotfiles:true dir | ||
| >>= OS.Path.fold ~dotfiles:true ~err ~elements ~traverse Fpath.Set.add | ||
| Fpath.Set.empty | ||
|
|
||
| let tar dir ~exclude_paths ~root ~mtime = | ||
| let tar dir ~exclude_paths ~export_ignore ~root ~mtime = | ||
| let tar_add file tar = | ||
| let fname = | ||
| match Fpath.rem_prefix dir file with | ||
| | None -> assert false | ||
| | Some file -> Fpath.(root // file) | ||
| in | ||
| Logs.info (fun m -> m "Archiving %a" Fpath.pp fname); | ||
| tar >>= fun tar -> | ||
| OS.Dir.exists file >>= function | ||
| | true -> Tar.add tar fname ~mode:0o775 ~mtime `Dir | ||
| | true -> | ||
| (* Skip directories - they will be created implicitly when their | ||
| contents are added. This ensures that directories whose contents | ||
| are excluded via export-ignore patterns don't appear as empty | ||
| directories in the archive. *) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But isn't this what one would expect to do? If I exclude a directory, I would expect the directory to show up, but not its contents. Git does this but its more to do with the fact that Git doesn't track directories at all, so an empty directory is unrepresentable. But
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My thinking is that the context here is reading a |
||
| Ok tar | ||
| | false -> | ||
| Logs.info (fun m -> m "Archiving %a" Fpath.pp fname); | ||
| OS.Path.Mode.get file >>= fun mode -> | ||
| OS.File.read file >>= fun contents -> | ||
| let mode = if 0o100 land mode > 0 then 0o775 else 0o664 in | ||
| Tar.add tar fname ~mode ~mtime (`File contents) | ||
| in | ||
| path_set_of_dir dir ~exclude_paths >>= fun fset -> | ||
| path_set_of_dir dir ~exclude_paths ~export_ignore >>= fun fset -> | ||
| Fpath.Set.fold tar_add fset (Ok Tar.empty) >>| fun tar -> Tar.to_string tar | ||
|
|
||
| (* Bzip2 compression and unarchiving *) | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,16 +13,19 @@ open Bos_setup | |
| val tar : | ||
| Fpath.t -> | ||
| exclude_paths:Fpath.set -> | ||
| export_ignore:Gitattributes.pattern list -> | ||
| root:Fpath.t -> | ||
| mtime:int64 -> | ||
| (string, R.msg) result | ||
| (** [tar dir ~exclude_paths ~root ~mtime] is a (us)tar archive that contains the | ||
| file hierarchy [dir] except the relative hierarchies present in | ||
| [exclude_paths]. In the archive, members of [dir] are rerooted at [root] and | ||
| sorted according to {!Fpath.compare}. They have their modification time set | ||
| to [mtime] and their file permissions are [0o775] for directories and files | ||
| executable by the user and [0o664] for other files. No other file metadata | ||
| is preserved. | ||
| (** [tar dir ~exclude_paths ~export_ignore ~root ~mtime] is a (us)tar archive | ||
| that contains the file hierarchy [dir] except: | ||
| - relative hierarchies present in [exclude_paths] (basename matching) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if it wouldn't make more sense to just translate
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That could make sense. My worry is that |
||
| - files matching patterns in [export_ignore] (from [.gitattributes]) | ||
|
|
||
| In the archive, members of [dir] are rerooted at [root] and sorted according | ||
| to {!Fpath.compare}. They have their modification time set to [mtime] and | ||
| their file permissions are [0o775] for directories and files executable by | ||
| the user and [0o664] for other files. No other file metadata is preserved. | ||
|
|
||
| {b Note.} This is a pure OCaml implementation, no [tar] tool is needed. *) | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,132 @@ | ||
| open Bos_setup | ||
|
|
||
| type pattern = | ||
| | Exact of string (** Exact match against basename or full path. *) | ||
| | Prefix of string | ||
| (** Pattern like [dir/**] that matches everything under a directory, but | ||
| not the directory itself. *) | ||
| | Glob of Re.re (** Compiled glob pattern. *) | ||
|
|
||
| (** [glob_to_re pattern] is a compiled regex for glob [pattern]. Supports [*] | ||
| (any chars except /), [?] (single char except /), and [**] (any path | ||
| segments, but only when adjacent to /). *) | ||
| let glob_to_re pattern = | ||
| let buf = Buffer.create (String.length pattern * 2) in | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is the initial size
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Dropped the |
||
| Buffer.add_char buf '^'; | ||
| let len = String.length pattern in | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nitpick but the length of the pattern is determined twice, so could be pulled up in the function an reused.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. |
||
| let rec loop i = | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if it wouldn't make sense to use a
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That could be done. But I think that using a Buffer for building a string char-by-char is idiomatic, and the mutable state is encapsulated within a single function. |
||
| if i >= len then () | ||
| else | ||
| let c = pattern.[i] in | ||
| match c with | ||
| | '*' -> | ||
| if i + 1 < len && pattern.[i + 1] = '*' then | ||
| (* ** only crosses path separators when adjacent to / *) | ||
| let preceded_by_slash = i > 0 && pattern.[i - 1] = '/' in | ||
| let at_start = i = 0 in | ||
| if i + 2 < len && pattern.[i + 2] = '/' then ( | ||
| (* **/ matches zero or more directories *) | ||
| Buffer.add_string buf "(.*/)?"; | ||
| loop (i + 3)) | ||
| else if i + 2 >= len && (preceded_by_slash || at_start) then ( | ||
| (* /** at end or just ** alone - matches anything *) | ||
| Buffer.add_string buf ".*"; | ||
| loop (i + 2)) | ||
| else ( | ||
| (* ** not adjacent to / - acts like * *) | ||
| Buffer.add_string buf "[^/]*"; | ||
| loop (i + 2)) | ||
| else ( | ||
| (* * matches anything except path separator *) | ||
| Buffer.add_string buf "[^/]*"; | ||
| loop (i + 1)) | ||
| | '?' -> | ||
| (* ? matches any single character except path separator *) | ||
| Buffer.add_string buf "[^/]"; | ||
| loop (i + 1) | ||
| | '.' | '+' | '^' | '$' | '(' | ')' | '[' | ']' | '{' | '}' | '|' | '\\' | ||
| -> | ||
| (* Escape regex metacharacters *) | ||
| Buffer.add_char buf '\\'; | ||
| Buffer.add_char buf c; | ||
| loop (i + 1) | ||
| | _ -> | ||
| (* Literal character *) | ||
| Buffer.add_char buf c; | ||
| loop (i + 1) | ||
| in | ||
| loop 0; | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe it could be more maintainable to use
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe, but I don't know. It would add a build dependency and generated file, and I don't expect that the syntax of globs is likely to change often. It could just as easily be done later, but I don't know if I will have time to work on that soon. |
||
| Buffer.add_char buf '$'; | ||
| Re.Pcre.regexp (Buffer.contents buf) | ||
|
|
||
| let parse_pattern s = | ||
| let s = String.trim s in | ||
| (* Remove leading slash if present - we always match relative paths *) | ||
| let s = | ||
| if String.is_prefix ~affix:"/" s then | ||
| String.Sub.to_string (String.sub ~start:1 s) | ||
| else s | ||
| in | ||
| let has_wildcard s = String.exists (fun c -> c = '*' || c = '?') s in | ||
| if String.is_suffix ~affix:"/**" s then | ||
| (* Directory pattern: match everything under the directory *) | ||
| let prefix = | ||
| String.Sub.to_string (String.sub ~stop:(String.length s - 3) s) | ||
| in | ||
| if has_wildcard prefix then | ||
| (* Prefix contains wildcards, treat whole pattern as glob *) | ||
| Glob (glob_to_re s) | ||
| else Prefix prefix | ||
| else if has_wildcard s then | ||
| (* Has wildcards - compile as glob *) | ||
| Glob (glob_to_re s) | ||
| else | ||
| (* Exact match *) | ||
| Exact s | ||
|
|
||
| let matches path pattern = | ||
| let path = Fpath.normalize path in | ||
| let path_str = Fpath.to_string path in | ||
| let basename = Fpath.basename path in | ||
| match pattern with | ||
| | Exact s -> | ||
| (* Match against basename or full relative path *) | ||
| String.equal s basename || String.equal s path_str | ||
| | Prefix prefix -> | ||
| (* Match everything under the directory, but not the directory itself *) | ||
| String.is_prefix ~affix:(prefix ^ "/") path_str | ||
| | Glob re -> | ||
| (* Match against full path or basename for patterns like *.log *) | ||
| Re.execp re path_str || Re.execp re basename | ||
|
|
||
| let parse_export_ignore content = | ||
| (* Strip UTF-8 BOM if present at start of file *) | ||
| let content = | ||
| if String.is_prefix ~affix:"\xef\xbb\xbf" content then | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would suggest putting
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
| String.Sub.to_string (String.sub ~start:3 content) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And use
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
| else content | ||
| in | ||
| content |> String.cuts ~sep:"\n" | ||
| |> List.filter_map (fun line -> | ||
| let line = String.trim line in | ||
| (* Skip empty lines and comments *) | ||
| if String.length line = 0 || String.is_prefix ~affix:"#" line then None | ||
| else | ||
| (* Format: <pattern> <attr1> <attr2> ... | ||
| Attributes can be separated by spaces or tabs *) | ||
| let parts = | ||
| String.fields ~empty:false | ||
| ~is_sep:(fun c -> c = ' ' || c = '\t') | ||
| line | ||
| in | ||
| match parts with | ||
| | pattern :: attrs | ||
| when List.exists (String.equal "export-ignore") attrs -> | ||
| Some (parse_pattern pattern) | ||
| | _ -> None) | ||
|
|
||
| let read_export_ignore dir = | ||
| let file = Fpath.(dir / ".gitattributes") in | ||
| OS.File.exists file >>= function | ||
| | false -> Ok [] | ||
| | true -> OS.File.read file >>| parse_export_ignore | ||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,32 @@ | ||||||
| (** Gitattributes parsing for export-ignore. | ||||||
|
|
||||||
| Parses [.gitattributes] files and extracts patterns marked with the | ||||||
| [export-ignore] attribute. These patterns can be used to exclude files from | ||||||
| distribution archives. *) | ||||||
|
|
||||||
| open Bos_setup | ||||||
|
|
||||||
| (** {1 Patterns} *) | ||||||
|
|
||||||
| type pattern | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||||||
| (** The type for gitattributes patterns. *) | ||||||
|
|
||||||
| val parse_pattern : string -> pattern | ||||||
| (** [parse_pattern s] is the pattern parsed from string [s]. Supports: | ||||||
| - Exact matches: [filename] | ||||||
| - Directory patterns: [dir/**] | ||||||
| - Glob patterns: [*.ext], [prefix*] *) | ||||||
|
|
||||||
| val matches : Fpath.t -> pattern -> bool | ||||||
| (** [matches path pattern] holds if [path] matches [pattern]. [path] should be | ||||||
| relative to the repository root. *) | ||||||
|
|
||||||
| (** {1 Parsing .gitattributes} *) | ||||||
|
|
||||||
| val parse_export_ignore : string -> pattern list | ||||||
| (** [parse_export_ignore content] is the list of patterns marked with | ||||||
| [export-ignore] in [.gitattributes] file [content]. *) | ||||||
|
|
||||||
| val read_export_ignore : Fpath.t -> (pattern list, R.msg) result | ||||||
| (** [read_export_ignore dir] is the list of patterns marked with [export-ignore] | ||||||
| in [dir/.gitattributes], or the empty list if the file doesn't exist. *) | ||||||
Uh oh!
There was an error while loading. Please reload this page.