* Improving support for name changes in git
From: Bran Hagger @ 2023-04-04 18:00 UTC
To: git@vger•kernel.org; +Cc: Emily Shaffer

Hello Git community,

I'm interested in volunteering to help improve the process for users changing their name in Git. I've seen the notes from the Git summit[1] and the old proposal to change the .mailmap to use hashes instead of plaintext names[2]. The problem with both approaches is that it is easy for other users to figure out the old name, which is a privacy concern for many people who change their names. Since the hashes in the second case can easily be reversed by brute force, using hashes in the .mailmap provides no additional protection.

A system that prevents people from reverse-engineering the old name of a user who changes their name would require two key components:

1. The method of determining the current name of the author of a git commit cannot rely on any information derived from their old name.
2. The mapping to the current name of the author of a git commit cannot contain any history.

Solving the first problem seems reasonably doable. Instead of each commit having an author name and email, the author section could contain a hash that is used for the mapping. To maintain compatibility with older versions of git, the format could look something like:

    Author: Hash #user.idHash <email@lookIn•newMailmap>

Here user.idHash is a randomly generated number set in the .gitconfig the same way user.name and user.email currently are. A .newMailmap file (or whatever name we choose to give it) would then map from user id hashes to user names and emails.

The second problem of how to maintain a mapping of user.idHash without history is a radical departure from how git currently works. While handling such a file on the client side is probably not too technically complicated, it raises several questions:

* How can a git repository accept changes and protect against malicious actors modifying the .newMailmap file (or however we choose to name it)? Making pull requests to modify the file and keeping those pull requests around recreates the old issue of having a record of every name change.
* How are merge conflicts handled?
* How do we ensure users can only set the name and email for their own hashes? If the commits are signed this could be done via signing verification, but my understanding is that signing commits is relatively rare.

Has there been any further work done on supporting git name changes that I missed? Are there any existing files without git history that face similar issues?

[1] https://code.googlesource.com/git/summit/2020/+/main/notes.md
[2] https://lore.kernel.org/git/20210103211849.2691287-1-sandals@crustytoothpaste.net/

Thank you,
Bran (he/him)

P.S. Apologies for potentially double-sending this. My email client accidentally added HTML to the first copy.
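
The client-side lookup described in the message above could be sketched roughly as follows. This is only an illustration under stated assumptions: the message does not define a file format, so the one-entry-per-line layout `<idHash> <Name> <email>` for .newMailmap, the hexadecimal form of the id, and the helper names `parse_new_mailmap` and `resolve_author` are all hypothetical and not part of Git.

```python
import re

# Assumed shape of the proposed author field: "Hash #<id> <placeholder>".
# The hexadecimal id is an assumption; the proposal only says "random number".
AUTHOR_RE = re.compile(r"^Hash #(?P<id>[0-9a-fA-F]+)\s+<[^>]*>$")


def parse_new_mailmap(text):
    """Parse a hypothetical '<idHash> <Name> <email>' file into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # ignore malformed entries in this sketch
        id_hash, rest = parts
        name, _, email = rest.rpartition("<")
        mapping[id_hash.lower()] = (name.strip(), email.rstrip(">").strip())
    return mapping


def resolve_author(author_field, mapping):
    """Return (name, email) for a 'Hash #<id> <...>' author, else None."""
    match = AUTHOR_RE.match(author_field.strip())
    if match is None:
        return None  # ordinary "Name <email>" author; not handled here
    return mapping.get(match.group("id").lower())
```

Note that the lookup itself never touches an old name, which matches the first requirement in the message. The sketch does nothing about the second requirement: whoever can read earlier copies of the mapping file can still see earlier entries.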
* Re: Improving support for name changes in git
From: Junio C Hamano @ 2023-04-06 1:59 UTC
To: Bran Hagger; +Cc: git@vger•kernel.org, Emily Shaffer

Bran Hagger <brhagger@microsoft•com> writes:

> I'm interested in volunteering to help improve the process for
> users changing their name in Git.

To "improve", we need to understand what these users want when they change their names. Changing the names and changing the e-mail addresses are both commonly done, and people, depending on circumstances, want different things from the tool. Some do not want it to be known that the person who used to use that old name is you, the person who uses this new name. Some do not mind their old name or address being in the record but they want to take credit for what they did under both names. There may be some position in between, with various degrees of being realistic (e.g. "I do not want to be associated with the old commit, but at the same time I do want to take credit for it"---is that a reasonable desire?).

> Solving the first problem seems reasonably doable....

Up to this point, I found what you wrote to be reasoned very nicely. However, ...

> The second problem of how to maintain a mapping of user.idHash
> without history is a radical departure from how git currently
> works.

... I think the above is an understatement. No "radical departure" would change the fundamental issue here: people need to be able to map the random token X to the "current name" right now, and the mechanism used to do so can be replayed at a later date because a mapping will be distributed, copied and saved. Or a much simpler and more obvious source of the problem is that people have memories.

People change names and addresses over the course of their lives. Employers may encourage their employees to use their corporate ident when contributing to an external project, and often their employment contract would make it clear that rights to the work belong to the employers. When employees move on, old contributions need to stay "owned" by the original user ident.

    Side note: when changing an employer, people may more often change the address but not names. But technically names and addresses as part of author/committer ident have the same characteristics in Git (e.g. being part of an etched-in-stone identity string). Since the address is much less loaded emotionally, I'll talk about address change in this paragraph, but the same discussion applies to name change.

Some of these employees may not mind letting others know that the person who made these old contributions and the person who is making new contributions under a different name and/or address are the same person. Others may be ashamed of their past association with the $EVIL company and may want to start afresh, without their past employment with them being known.

The mailmap mechanism is a great fit for the former group of folks. It allows them to group the contributions by such a person who had multiple idents over time into a single bucket. But the mechanism may not be suited for other uses, including the latter.

Some folks, after changing their name and/or address, do not want it to be known that they used to use that name and/or address (e.g. they may be a victim of a crime, being stalked, etc.). The mailmap mechanism would not help, even with your "random token" redirection, and it shouldn't, because those folks do not want to be associated with their old ident after they started using a new one.

The idea to use mailmap to somehow "link" the author of these old commits (made under old ident) and the author of the new commits (made under new ident of the same physical person who wants the association with the old ident not to be known) _creates_ the problem of "the linkage between two idents, which was made with clever use of random token to make it irreversible, can be recovered".

If "Such and such person used to work at $CORP and made these contributions" was publicly known as a fact before the person changed their name and/or address, it is impossible to force all other people to forget.

Wouldn't the only practical solution be to stick to your new ident, and not talk about the old ident you used to use? If you try to abuse mailmap for something it wasn't even designed to do and have any entry to link the old and new ident in some way, isn't it backwards as a solution, when what you want is for the linkage between the old ident and you not to be known?
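
The "replay" concern in the message above can be illustrated with a short sketch. It assumes a hypothetical token-to-ident mapping (the thread defines no such file) and that some third party kept two copies of it from different points in time; the function name `linked_identities` and all the sample data are made up for illustration.

```python
def linked_identities(old_snapshot, new_snapshot):
    """Return {token: (old_ident, new_ident)} for tokens whose ident changed."""
    return {
        token: (old_snapshot[token], new_ident)
        for token, new_ident in new_snapshot.items()
        if token in old_snapshot and old_snapshot[token] != new_ident
    }


# Hypothetical copies of the mapping saved before and after a name change.
before = {
    "9f3a": "Old Name <old@corp.example>",
    "c0de": "Sam <sam@example.org>",
}
after = {
    "9f3a": "New Name <new@example.org>",
    "c0de": "Sam <sam@example.org>",
}

# Anyone holding both copies can pair the old ident with the new one.
print(linked_identities(before, after))
```

The random token is what makes the pairing trivial here: because it stays stable across the change, comparing any two saved copies recovers the linkage without any brute-forcing.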

* Re: Improving support for name changes in git
From: Gwyneth Morgan @ 2023-04-26 18:35 UTC
To: Bran Hagger; +Cc: git@vger•kernel.org, Emily Shaffer, brian m. carlson

On 2023-04-04 18:00:00+0000, Bran Hagger wrote:
> Has there been any further work done on supporting git name changes that I missed? Are there any existing files without git history that face similar issues?
>
> [1] https://code.googlesource.com/git/summit/2020/+/main/notes.md
> [2] https://lore.kernel.org/git/20210103211849.2691287-1-sandals@crustytoothpaste.net/

There was another proposal posted by brian last year, using signing keys the author controls instead of hashes:

https://lore.kernel.org/git/20220919145231.48245-1-sandals@crustytoothpaste.net/T/

A different VCS, Pijul, recently adopted a system that seems similar to brian's proposal, and may provide some inspiration on the user experience. I haven't seen documentation for it, but there are some examples of commands here:

https://nest.pijul.com/pijul/pijul/discussions/706