03/15/21

AESD Lab Programming Best Practices

This document presents the set of C# best practices we maintain across all our projects at AESD Lab. Let's dive straight in.

We will use several designations:

πŸ‘/πŸ‘Ž Good practice / Bad practice
❔ Explainer
πŸ‘€ Pay attention

Learning advice: read books and listen to the opinions of others, but always exercise your own judgement. Any thoughts on best practice in software development are the product of an individual's own experience and beliefs. Try out the theory in practice. Don't be afraid to experiment, but make sure your teammates understand your approach before they find it in production :)

Exception Handling

πŸ‘ Do always implement exception handling in the top level code of your application. This is a minimum requirement

πŸ‘€ Always wrap thread and timer callbacks with a try/catch block
πŸ‘€ The .NET Framework provides a low-level event, AppDomain.UnhandledException, for global exception handling. Always implement a handler for this event to be sure that any unexpected issue is logged, but note that it cannot be relied on to prevent process termination.
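
A minimal sketch of such a handler, assuming the same _logger field used in the examples below:

AppDomain.CurrentDomain.UnhandledException += (sender, args) =>
{
    // args.ExceptionObject may be any object, not necessarily an Exception
    var exception = args.ExceptionObject as Exception;
    _logger.Log(exception ?? new Exception($"Unhandled: {args.ExceptionObject}"));
    // by the time this handler runs, termination (if any) can no longer be prevented
};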

  • Threads
    Exception handling here is critical. There is no way for global handling to prevent process termination:
    new Thread(() =>
    {
        while (!token.IsCancellationRequested)
        {
            try
            {
                // do heavy job
            }
            catch (Exception e)
            {
                _logger.Log(e);
            }
        }
    }).Start();
  • Threading.Timer
    Exception handling here is critical. There is no way for global handling to prevent process termination:
    var timer = new Timer(state =>
    {
        try
        {
            // do regular job
        }
        catch (Exception e)
        {
            _logger.Log(e);
        }
    });
    timer.Change(0, 100);
    πŸ‘€ We prefer System.Threading.Timer over System.Timers.Timer. The latter is simply a wrapper around the former and does not offer any particular advantage.
    System.Threading.Timer does not suppress any unhandled exceptions by itself and requires us to at least log raised exceptions

πŸ‘ Do setup a global error handler for any unhandled C# exception.
πŸ‘€ Always log unhandled exceptions using the appropriate global error handler. The exact method will differ depending on the type of C# project:

  • ASP.NET WebApi for IIS classic pipeline:
    protected void Application_Start()
    {
        GlobalConfiguration.Configure(Register);
    }
    
    public static void Register(HttpConfiguration config)
    {
        config.Services.Add(typeof(IExceptionLogger), new MyExceptionLogger());
        // optionally, for page-based applications, inject your own error page
        config.Services.Replace(typeof(IExceptionHandler), new ErrorHandler());
    }
  • ASP.NET WebApi OWIN Middleware pipeline:
    public void Configuration(IAppBuilder builder)
    {
        // preceding middlewares are not wrapped
        builder.Use<UnhandledExceptionMiddleware>();
        // subsequent middlewares are wrapped
    }
  • ASP.NET WebForms:
    void Application_Error(object sender, EventArgs e)
    {
        Exception exc = Server.GetLastError();
        if (exc is HttpUnhandledException)
        {
            // access original exception in exc.InnerException
            Server.Transfer("MyErrorPage.aspx", preserveForm: true);
        }
    }
    
  • ASP.NET MVC (web.config); an ErrorController is intended to handle the configured redirects:
    πŸ‘€ there is also an advanced option with <system.webServer> section <httpErrors> tag
    <system.web>
      <customErrors mode="On" defaultRedirect="~/Error">
        <error redirect="~/Error/NotFound" statusCode="404" />
      </customErrors>
    </system.web>
  • ASP.NET MVC: the Application_Error handler is still applicable and has greater scope than the customErrors configuration, but Server.Transfer cannot target Razor views, so you are limited to *.aspx template files
    void Application_Error(object sender, EventArgs e)
    {
        Exception exc = Server.GetLastError();
        // exc is the original exception
        Server.Transfer("MyErrorPage.aspx", preserveForm: true);
    }
  • ASP.NET Core: the example handler here is from the MVC boilerplate code, which targets the shared Error.cshtml view. The handler itself already takes care of logging. There is almost no difference in configuration for a web API, except that you would likely have an ErrorController:
    public void Configure(IApplicationBuilder app)
    {
        // configure logging with the built-in services
        // preceding middlewares are not wrapped
        app.UseExceptionHandler("/Home/Error");
        // subsequent middlewares are wrapped
        // MVC configuration is almost the same
        app.UseMvc(routes => { });
    }

πŸ‘ Do await all tasks before being garbage collected

public async Task CompositeWork(CancellationToken token)
{
    var tasks = new List<Task>(Collection.Count);

    foreach (object item in Collection)
        tasks.Add(Work(item, token));

    await Task.WhenAll(tasks);
    foreach (Task task in tasks)
    {
        await task;
        // e.g. run some per-task continuation here
    }
}

❔ If execution of task t results in an exception and t is subsequently garbage collected without ever being awaited, the exception goes unobserved; depending on the runtime and its configuration this can terminate the application
❔ The task returned by Task.WhenAll holds an AggregateException when at least one of the nested tasks has thrown
❔ Task.WhenAll can be replaced with an explicit try/catch block around each task await in the loop, with all the nested exceptions collected manually before being rethrown outside, as sketched below
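
A minimal sketch of that alternative, reusing the hypothetical Collection and Work members from the example above:

public async Task CompositeWork(CancellationToken token)
{
    var tasks = new List<Task>(Collection.Count);
    foreach (object item in Collection)
        tasks.Add(Work(item, token));

    var errors = new List<Exception>();
    foreach (Task task in tasks)
    {
        try
        {
            await task; // observe every task, even if an earlier one failed
        }
        catch (Exception e)
        {
            errors.Add(e);
        }
    }

    if (errors.Count > 0)
        throw new AggregateException(errors);
}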

πŸ‘Ž Do not replace exceptions with special method return values or "error codes"

public Dictionary<int, Tag> LoadTags(IEnumerable<int> pocoModelIds)
{
    try
    {
        // e.g., calls a database to load tags
    }
    catch (Exception e)
    {
        _logger.Log(e); // the logged stack trace lacks the calling context
        return null;
    }
}

❔ Forces the calling code to make redundant checks
❔ Potentially leads to a NullReferenceException in the calling code which has lost context
❔ Simply logging the exception here does not capture the full stack trace (the calling context is lost), so log analysis alone is insufficient for debugging; the missing context has to be reconstructed manually in addition to reading the logs.

πŸ‘Ž Do not leave empty catch blocks

try
{
    // some work
}
catch (Exception e)
{
}

❔ Unless you are dealing with an API where something must be tested before performing an action that may throw, there is no good reason for an empty catch block. We may wish to suppress a "minor" exception so that composite logic can continue without interruption; in that case the best practice is to at least log the occurrence of the exception. In the general case of business/application logic, leaving the catch block empty is considered bad practice.

πŸ‘Ž Do not try/catch custom exceptions within iterations

❔ In iterative infrastructure code which implements a failure/retry policy, you should attempt to determine whether the underlying errors are transient or more serious. The code should only throw an exception for non-transient errors or when all retry attempts have been exhausted (see the retry sketch below)
❔ For regular collection-based algorithms, throwing an exception per iteration leads to the worst possible performance: it defeats the compiler optimizations available for structured control flow and adds the cost of exception handling on every iteration.
πŸ‘€ Consider a thread performing regular batch processing with an outer loop over a number of transactions. Any individual data transaction may throw an exception, however, the overall batch processing should be protected from failure. Hence the thread callback (transaction worker function) must implement exception handling with at least minimal logging.
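
A minimal sketch of such a retry policy; ExecuteWithRetry, IsTransient and the backoff are illustrative assumptions, not an existing API:

public void ExecuteWithRetry(Action action, int maxAttempts = 3)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            action();
            return;
        }
        catch (Exception e)
        {
            // rethrow immediately for non-transient errors or when attempts are exhausted
            if (!IsTransient(e) || attempt == maxAttempts)
                throw;

            _logger.Log(e);
            Thread.Sleep(TimeSpan.FromSeconds(attempt)); // simple backoff
        }
    }
}

private static bool IsTransient(Exception e)
    => e is TimeoutException; // extend with provider-specific transient error checks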

Recursion

πŸ‘ Do limit recursion depth

public class Node
{
    private const int MaxDepth = 15;
    public List<Node> Childs { get; }

    public void Init()
    {
        InitDeep(1);
    }

    private void InitDeep(int depth)
    {
        if (depth > MaxDepth)
            throw new MyRecursionDepthExceededException();

        int deeperDepth = depth + 1;
        for (int i = 0; i < Childs.Count; i++)
            Childs[i].InitDeep(deeperDepth);
    }
}

❔ Unlimited recursion depth will eventually cause a fatal StackOverflowException. Here's a real-life example: a web application used by a number of law firms contained a feature implemented with recursion. One of the clients had an invalid cyclic reference in their data. As a result, the recursive function entered an unbounded loop and crashed the entire web platform for all clients. The platform was unusable for tens of thousands of users for several hours while programmers tracked down the issue. Had the algorithm limited its recursion depth, the error would have been contained to a single query and the platform would not have been brought down for all users.
❔ Remember that the call stack is limited to 1 MB by default (4 MB for 64-bit processes). In the CLR this accommodates thousands of calls, but if you process a very large recursive data structure with a recursive algorithm, it will eventually crash on the N-th element due to stack exhaustion.
❔ Remember: C# can catch all exceptions except StackOverflowException, which cannot be caught and terminates the process. Be very mindful of this if your program happens to be a business-critical server process handling many thousands of concurrent client connections!
❔ Prefer converting regular tail-recursion to iteration
❔ Consider converting non-tail recursion to iteration. Refer to this article
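
For example, the recursive Init traversal above can be rewritten as iteration with an explicit stack (a sketch reusing the same Node members):

public void Init()
{
    var pending = new Stack<(Node node, int depth)>();
    pending.Push((this, 1));

    while (pending.Count > 0)
    {
        var (node, depth) = pending.Pop();
        if (depth > MaxDepth)
            throw new MyRecursionDepthExceededException();

        // the call stack no longer grows with the data; only this heap-allocated stack does
        foreach (Node child in node.Childs)
            pending.Push((child, depth + 1));
    }
}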

Consistency

πŸ‘ Do have single DB transaction per single data modifying use case

public class Service 
{
    public void Create(PocoModel entity)
    {
        entity.AssertForCreate();
        if (entity.Tags != null)
            _tagsService.Put(entity.TenantId, entity.Tags);

        using (var scope = CreateTransactionScope())
        {
            PersistEntity(entity);
            if (entity.Tags != null)
                _tagsService.CreatePocoTags(entity.Id, entity.Tags);

            scope.Complete();
        }
    }
}

❔ Each data interaction with the system should move it from one consistent state to the next. There should be no possibility for a use case to be interrupted before completion, leaving data in an unexpected state.
❔ When some part of a transaction is idempotent, for example adding tags in the above example, it can simply be moved before the start of the transaction to avoid holding unnecessary or redundant locks. Idempotency here means that the tag records are added once only for a given entity. Once added, a tag will never be changed; it follows that each attempt to add the tag to the tenant will either do nothing if it was inserted before, or insert a record if it was not.
❔ Overly complex transactions can become resource heavy. Be careful of too many locked resources or violations of the expected locking order, which lead to timeouts and/or deadlocks. To avoid this, such parts of a single transaction can be isolated and replaced with long-running-transactions. In simple cases, instead of making many updates to related resources in the same transaction, it is preferable to create a separate collection of "continuation" tasks holding the minimal data required to continue the related updates in a background process owned by the responsible feature
πŸ‘€ Idempotent calls can actually be placed under the transaction scope while still acting outside the transaction. This requires wrapping persistence of the related data in a new scope created with the TransactionScopeOption.Suppress option (sketched below). This can be the preferred way to interact with the main transaction when business logic is decomposed into features by a framework which transparently applies transaction scopes by design
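
A minimal sketch of such a suppressed call, reusing the names from the earlier Create example:

using (var scope = CreateTransactionScope())
{
    using (var suppressed = new TransactionScope(TransactionScopeOption.Suppress))
    {
        // idempotent: runs outside the ambient transaction even though it is nested in the scope
        if (entity.Tags != null)
            _tagsService.Put(entity.TenantId, entity.Tags);

        suppressed.Complete();
    }

    PersistEntity(entity);
    if (entity.Tags != null)
        _tagsService.CreatePocoTags(entity.Id, entity.Tags);

    scope.Complete();
}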

πŸ‘ Do assert arguments for public methods

❔ A way to prevent meaningless NullReferenceExceptions in the logs and to avoid long chains of errors arising from the subsequent use of values stored in class state (a guard-clause sketch follows the list below)
πŸ‘€ There is a balance between littering too many assertions across the code and ignoring them completely. The rule of thumb is to make fewer assertions between two cohesive code units (less distance) and more assertions between loosely coupled units (greater distance). This rule is generally applicable to:

  • interaction between several classes (coupling)
  • forming a class's state which will be used by a different method later (invariance control)
  • providing a base class method to be called by a subtype method (tooling)
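
A minimal guard-clause sketch for a public service method; the TenantId check is an illustrative assumption:

public void Create(PocoModel entity)
{
    if (entity == null)
        throw new ArgumentNullException(nameof(entity));
    if (entity.TenantId <= 0)
        throw new ArgumentException("TenantId must be positive", nameof(entity));

    // the rest of the use case can now rely on a valid argument
}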

πŸ‘Ž Do not expect to find data models in a consistent state after exceptions are thrown from a service.
Typically we refer to classes as services when they are public and designed for external interaction. A service should follow the same predictable approach adopted by REST APIs, i.e. a thread-safe and stateless CRUD implementation (Create/Read/Update/Delete). This is important if services are to be useful and versionable, and it is why we use data models to interact with them. Service responsibilities can include preparing the data state before it is persisted. If an exception is thrown in the middle of this process, some data members may be left unprepared and may deviate from their documented state. Be careful when using such data models afterwards: it may lead to inconsistency and unexpected issues.

public class Service 
{
    public void Create(PocoModel entity)
    {
        entity.AssertForCreate(); // thrown in the middle
    }
}

public class SomeApplicationClass
{
    public void Process(Message message)
    {
        PocoModel entityToCreate = MapToEntity(message);
        try
        {
            _service.Create(entityToCreate);
        }
        catch (Exception e)
        {
            // may throw a NullReferenceException if Create failed mid-preparation
            if (entityToCreate.AlwaysNonNullCollection.Count == 0)
                throw;

            _logger.Exception(e);
        }
    }
}

πŸ‘€ Where information on exceptions occurring inside a service needs to be communicated back to the calling application, such exceptions should be incorporated into model exceptions

Maintainability

πŸ‘ Do prefer to patch the data instead of updating it
The problem with an update operation performed with a data model becomes evident when the model is round-tripped between client and server, e.g. from REST to UI and back with updates. Consider the case where the client (service consumer) is not up to date with respect to the service provider: this can lead to data loss when the modified data is passed back without values for (as yet) unsupported model members. To avoid that, the update operation should recognize which data members are intended to be updated and which are not. The simplest way to identify whether a data member should participate is to treat one of the possible domain values of its type as non-initialized. The value can be used explicitly or simply omitted by the consuming side; it makes no difference to the implementing side.

❔ Simpler and safer versioning, i.e. no need for explicit version indication
❔ No risk of accidentally erasing part of the data by outdated calling logic
πŸ‘€ For value types in C# it is often convenient to use Nullable<T> to define the non-initialized state, as sketched below
πŸ‘€ Don't confuse the non-initialized state with a default value - both can be present at the same time
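
A minimal sketch of the patch approach, assuming a hypothetical PocoModelPatch where null means "not supplied, do not touch" (Load is a hypothetical data-access call):

public class PocoModelPatch
{
    // null means the member was not supplied and must not be updated
    public string Name { get; set; }
    public bool? IsActive { get; set; }
}

public void Patch(int id, PocoModelPatch patch)
{
    PocoModel entity = Load(id);

    if (patch.Name != null)
        entity.Name = patch.Name;
    if (patch.IsActive.HasValue)
        entity.IsActive = patch.IsActive.Value;

    PersistEntity(entity);
}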

πŸ‘ Do use static classes only for extension methods

❔ For public constants use appropriate business logic types. It is acceptable to define nested static classes for grouping of the constants.
❔ When implementing methods independent of public infrastructure which operate on collections or other kinds of containers (e.g. with items of business-logic types), extension methods are useful for extracting trivial loops out of the original code and improving its readability (see the sketch at the end of this section)
❔ When implementing methods dependent on public infrastructure (e.g. a published web API) never use static methods, even if the underlying infrastructure is exposed as static. When the time comes to introduce DI it will cost additional time to transform the code to non-static.

If an appropriate framework for code organization has already been introduced, implement its abstractions and use DI (Dependency Injection). If not, use singletons and keep your classes non-static until the framework is in place.
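
For instance, a trivial filtering loop over items of a business-logic type can be extracted into an extension method (a sketch; the TenantId property on Tag is an assumption):

public static class TagExtensions
{
    // extracts a trivial filtering loop so that calling code reads as a single expression
    public static IEnumerable<Tag> ForTenant(this IEnumerable<Tag> tags, int tenantId)
    {
        foreach (Tag tag in tags)
        {
            if (tag.TenantId == tenantId)
                yield return tag;
        }
    }
}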

πŸ‘ Do separate application and business logic concerns
Classes responsible for data transaction modelling, assertion and manipulation are generally called business logic (infrastructure-independent logic). Classes responsible for exposing APIs, running background processing, building the GUI or scaling are, in contrast, called application logic. Always implement them in separate assemblies and make sure that application logic does not appear inline in, or duplicate responsibilities of, the business logic.

❔ It is ok to have repetition of data assertion logic in a GUI (Graphical User Interface) app that relies on documented APIs
❔ System architecture is always subject to revision and change. This is especially true for early stage products and startups.
❔ It is important to keep business logic isolated in enterprise systems from the very beginning, because such systems tend to grow beyond the boundaries of a single application and will require business logic reuse and prolongation (long-running-transactions), often among several apps within the same system component.

πŸ‘ Do declare meaningful exceptions for data assertion

// app1
try
{
    _entityService.Create(...);
}
catch (EntityException e)
{
    return Response(e.Message);
}
catch (Exception e)
{
    _logger.Exception(e);
    return Response("Unknown error occurred");
}

// app2
try
{
    _entityService.Create(...);
}
catch (ParticipatingFeatureEntityException e)
{
    return Response(e.SomeFeatureSpecificPayload);
}
catch (Exception e)
{
    _logger.Exception(e);
    return Response(null);
}

❔ Simplifies adaptation of business logic to on-going application needs
❔ The response (type) can be extended with additional flags or subclassed to prepare feature specific payloads
❔ Always contains an exposable assertion message, and can additionally be extended with strategies to modify or emphasize parts of the initial message (a declaration sketch follows below)
❔ Doesn't interfere with regular return values, as would be the case with error codes
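
A minimal declaration sketch for such exceptions; the exact members are assumptions matching the catch blocks above:

public class EntityException : Exception
{
    // the message is consumer-facing and safe to expose in a response
    public EntityException(string message) : base(message) { }
}

public class ParticipatingFeatureEntityException : EntityException
{
    public object SomeFeatureSpecificPayload { get; }

    public ParticipatingFeatureEntityException(string message, object payload)
        : base(message)
    {
        SomeFeatureSpecificPayload = payload;
    }
}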

πŸ‘ Do use ambient transactions instead of provider specific

var options = new TransactionOptions
{
    IsolationLevel = IsolationLevel.ReadCommitted,
    Timeout = timeout
};
var scopeOption = TransactionScopeOption.Required;
using (var scope = new TransactionScope(scopeOption, options))
{
    PersistEntity(entity);
    if (entity.Tags != null)
        _tagsService.CreatePocoTags(entity.Id, entity.Tags);

    scope.Complete();
}

❔ Reduces code size, removes the need to pass a transaction object everywhere
❔ Best fit for use with exceptions, i.e. is automatically rolled back if the code didn't reach the Complete call
❔ Allows us to enlist into transaction management, e.g. to continue some minor actions on successful commit, outside the boundaries of the just-committed transaction
πŸ‘€ When the TransactionScope API is used directly, make sure that at least the creation of the scope is isolated in a single place and shared with all use cases (sketched below). This reduces redundant copy & pasted scope initialization and eliminates the risk of violating the isolation-level rule: the IsolationLevel of nested scopes must be set equal to the parent's
πŸ‘€ Internally the TransactionScope API makes use of ThreadLocal and AsyncLocal depending on one more parameter (asyncFlowOption). In fact the core of the ambient behaviour lies within these synchronization models, so "ambience" is not limited to TransactionScope. But beware of everyday direct use of these APIs - it can greatly complicate your code. The recipe for using ambient techniques safely is to stick to well-known and/or very well defined/documented APIs
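
A minimal sketch of isolating scope creation in a single place, as recommended above (the default timeout value is an assumption):

public static TransactionScope CreateTransactionScope(TimeSpan? timeout = null)
{
    var options = new TransactionOptions
    {
        // the isolation level is fixed here, so nested scopes can never violate the rule
        IsolationLevel = IsolationLevel.ReadCommitted,
        Timeout = timeout ?? TimeSpan.FromSeconds(30)
    };
    return new TransactionScope(
        TransactionScopeOption.Required,
        options,
        TransactionScopeAsyncFlowOption.Enabled);
}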

πŸ‘ Do always start transactions from the same "base" table/collection
This applies to the OLTP (OnLine Transactional Processing) part of an enterprise system. For each given entity there is always one table treated as the "base", because it was added first to denote the existence of the entity within a given system component. Each time an additional related table is added for some new feature, determine whether it will be "stable" or "unstable" relative to the "base". Interaction with "unstable" collections, which have an unlimited number of related records, should be extracted into a separate entry point issuing its own transaction; that collection then becomes another "base". For "stable" collections, or even 1-1 related records, interaction should be placed within the original transaction. In that case the modifying transaction should take the initial locks on the base table first, even when the actual use case doesn't need to change any part of the base record. These locks act like a mutex, so no change to a stable related part can be made partially.

❔ To avoid deadlocks, most frequently occurring between independently participating features within a transaction
πŸ‘€ For the OLAP (OnLine Analytical Processing) part of an enterprise system, specifically the need to synchronize aggregated values into an OLTP database, there is no need to make it part of the modelled transactions. The reason is the probable high frequency of such synchronization, which by nature would create too much contention with user interaction and long-running-transaction processing. Typically such aggregated values participate only in informational form and are delivered read-only through the same OLTP models. One of the major reasons for synchronizing such values is to allow efficient search by criteria. If business logic needs to react to periodic changes of the aggregated values, consider listening for corresponding events coming from the OLAP processing.

πŸ‘ Do persist / modify each part of the data model from the single place in code
It is bad design to have more than one class responsible for the persistence of any property of the data model.

πŸ‘€ Follow the above rule and make a single class responsible for persisting each logical part of the data model. Your approach or framework should not require or encourage duplication of persistence logic

πŸ‘Ž Do not declare an interface for each and every class/service

❔ An interface is an abstraction not a header file.
❔ Abstraction is needed to unify interaction with a set of code elements (polymorphism) or to organize boundaries between components but not for inflating the code nor for declaring "headers"
❔ Do not assume that your service is polymorphic in any way, or that you will ever proxy it via such a "header" interface
❔ Abstract types must never be changed
πŸ‘€ In the days of WCF (Windows Communication Foundation) dominance, "service" interfaces had to be extracted so that the service endpoint could be consumed from the client side. This architecture led people to believe that the framework wanted business logic organized this way everywhere, producing very tight coupling to the framework itself and teaching a whole generation to break every normal transaction entry point into separate RPC-style (Remote Procedure Call) services, each with its own extracted and maintained interface. Once its followers understood that this architecture leads to an eventual loss of transactionality and still did not scale as they wanted, the approach was displaced by REST, which motivates paying more attention to the modelled state rather than to use cases/behaviour. REST taught the next generation that all desired data interaction can be unified through a well-known, limited set of methods, so it became unnecessary to declare a ton of "headers" at all; instead (though far from in every project) a framework is established that declares a set of generic abstractions for resource/transaction-serving roles/attach points. With such a framework in a project component, declaring custom interfaces becomes unnecessary even for testing/mocking

πŸ‘Ž Do not change the set of a public enum's values

// before: a plain enum
public enum CampaignType { ... }

// after: an extensible value object replacing the enum
public abstract class CampaignType
{
    public Guid Id { get; }
    public string Name { get; }

    protected CampaignType(string id, string name) : this(Guid.Parse(id), name) { }

    protected CampaignType(Guid id, string name)
    {
        Id = id;
        Name = name;
    }
}

public sealed class PushCampaignType : CampaignType
{
    public PushCampaignType() : base(
        "bb4defdf-4bea-4b46-98ee-fe1954ecac62",
        "Push") { }
}

❔ For enums indicating different states of an object there are always at least one initial state and one terminal state. Any state dependent logic will very likely rely on these initial and terminal states as well as all the paths connecting them. Extending the path may well break the state machine logic or the behaviour expected by consumers of the class.
πŸ‘€ If you still need to extend the set of states then before making this heavy breaking change, consider adding an additional enumerated type to capture substates
❔ For enums indicating some variety of an object or behaviour, observe this rule: always serialize them in string form, and consider replacing the enum declaration with a collection of string constants. Then, in the relevant base class, extract an abstract value-object member (a string) for the type and assign each variant to that member in the derived classes. This way you still keep track of all variants and references, and you avoid git conflicts in one huge list coupling all variants together, which would otherwise be inevitable

πŸ‘Ž Do not declare non-private methods having more than 3 arguments

public class Service
{
    public void CreatePocoModel(
        int tenantId,
        List<Campaign> campaigns,
        PocoModelFeatureState featureState,
        List<Tag> tags)
    { ... }
}

❔ Assume the number of arguments will tend to grow. Each time the signature changes you need to both resolve merge issues and update all calling code
❔ In many cases the additional argument(s) being added are optional. That being the case, they can be extracted into an optional structure at design time (especially for libraries), as sketched below. For services this is definitely best practice; as mentioned earlier, for better versioning they should operate on well-defined data models instead of inlined arguments
❔ The same rule applies to public constructors, unless a facade pattern is responsible for class initialization; in the latter case constructors should be declared internal.
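
A sketch of extracting the optional arguments from the example above into a separate structure (the option type name is illustrative):

public class CreatePocoModelOptions
{
    public List<Campaign> Campaigns { get; set; }
    public PocoModelFeatureState FeatureState { get; set; }
    public List<Tag> Tags { get; set; }
}

public class Service
{
    // the signature now stays stable when new optional inputs are introduced
    public void CreatePocoModel(int tenantId, CreatePocoModelOptions options = null)
    { /* ... */ }
}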

πŸ‘Ž Do not make async use cases declared in business logic assume preferences for a synchronization context
While some code analysis tools, such as Roslynator, even have a rule suggesting explicit use of ConfigureAwait, following this suggestion reduces code readability and has no meaning at the business level. It is only relevant for GUI and other frameworks which enforce single-threaded event processing. Business logic should never make direct calls to application-related resources, so it is always possible to make the required adaptation in the app rather than inflate the business logic with an inconvenient payload.

❔ Business logic should not care about the limitations of the hosting app - for example, that only one thread may interact with form elements
❔ If the application needs to organize interaction with its components during business logic execution, it can be configured with application-level callbacks passed down the layers in the form of strategies. This way all the needed application-framework adaptation happens within the callbacks
❔ Generally speaking, this rule means not making any assumptions in code that has no explicit dependency on the framework requiring them. This indirectly refers to the definition of an Information Expert, so it is not limited to enterprise application code but has wider application, including libraries.

Data access performance

πŸ‘ Do prefer logical (soft) deletion over physical for domain entities data
Instead of actually deleting a table row, add a bit field to the table such as "IsActive" or "IsDeleted" (see the sketch at the end of this section)

❔ Often when we delete a persistent entity this involves deletion of far more than a single record from one table. For a system undergoing development this complexity will grow correspondingly. The preferred approach is to mark the entity record as logically deleted for numerous reasons, not least to keep UX (User eXperience) and performance under control
❔ Often when you need to delete a parent (entity) record, there are related (value/history) records which should be retained. When you soft delete the parent record, it stays in its last consistent state and there is no key violation: related records still reference a valid parent key, which gives us the ability to recover the data later
❔ When it becomes necessary to cleanup old, unwanted data, we prefer to use a separate background process that runs periodically and accurately, taking as little DB resource as possible. This kind of regular database housekeeping function should be scheduled in an established maintenance window along with, for example, scheduled index rebuilds. The latter often involves shrinking the database to reduce actual file size and optimize physical structure
❔ Keeping data records in place for some time (soft deletion) before being physically deleted, has additional benefits - it will save you time if customers ask to quickly recover data even if the feature for this was not yet implemented.
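
A minimal sketch of the soft-delete approach; the Execute and Query helpers stand in for whatever data access layer is used:

public class PocoModel
{
    public int Id { get; set; }
    public bool IsDeleted { get; set; } // logical deletion flag instead of a physical DELETE
}

public void Delete(int id)
{
    // hypothetical helper over the underlying data access layer
    Execute("UPDATE PocoModel SET IsDeleted = 1 WHERE Id = @id", new { id });
}

public IEnumerable<PocoModel> LoadActive()
{
    // active reads always filter the soft-deleted rows out
    return Query<PocoModel>("SELECT * FROM PocoModel WHERE IsDeleted = 0");
}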

πŸ‘ Do paginate entity retrieving API methods from the beginning
When an entity collection is going to be exposed in a public API it must be introduced as paginated, with a limited page size, from the beginning (a sketch follows at the end of this section). If this is not done and at least one consumer has appeared, there is no longer any painless way to introduce pagination or add a page-size limitation: it would be a breaking change and would require a major API update plus all the corresponding administrative tasks.

❔ Exposing collections without pagination provides no control over the system resources consumed by such an endpoint
πŸ‘€ Even if the data collection is not publicly exposed via an API, it may become so at some point in the future. Therefore assume this will be the case and prefer to implement paged access for any consuming code within your current system. Your data layer will thus be future-proofed and less of a target for refactoring
πŸ‘€ Do not provide access to a collection with a hardcoded "page size"/"top". Doing this means you are temporarily treating that part of the collection as stable for some consuming code, while probably already understanding that the collection is in fact unstable. When the time comes it will be recognized as unstable and exposed to the API, and you will have to update all your dependent code and/or maintain redundant code until the end of days (a major update)
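
A minimal sketch of a paginated retrieval method; PagedResult, MaxPageSize and LoadPage are illustrative assumptions:

public class PagedResult<T>
{
    public IReadOnlyList<T> Items { get; set; }
    public int TotalCount { get; set; }
}

public PagedResult<PocoModel> GetPocoModels(int page, int pageSize)
{
    const int MaxPageSize = 100;
    if (page < 1) page = 1;
    pageSize = Math.Min(Math.Max(pageSize, 1), MaxPageSize); // enforce the page size limit

    return LoadPage(page, pageSize); // hypothetical data-layer call
}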

πŸ‘Ž Do not perform additional DB queries per each result from already active result set
In simple terms: do not execute SQL queries in a loop. Create a single query that returns the required data.

❔ With MARS (Multiple Active Result Sets) enabled for one reused connection it still won't give any significant performance boost
❔ Without MARS enabled for one reused connection it will fail
❔ Without MARS enabled otherwise, each call would consume too many connections, leading to performance degradation for the given call as well as for other simultaneous calls contending for the connection pool
πŸ‘€ Prefer to load the additional data for all results in one separate query, then match it to the initial result set (as sketched below)
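
A sketch of the preferred shape, reusing the LoadTags method from earlier; LoadModels and the Tag property on the model are assumptions:

// one query for the main result set, one query for all related data
List<PocoModel> models = LoadModels(criteria);                        // hypothetical query
Dictionary<int, Tag> tagsByModelId = LoadTags(models.Select(m => m.Id));

foreach (PocoModel model in models)
{
    // match in memory instead of issuing a query per row
    if (tagsByModelId.TryGetValue(model.Id, out Tag tag))
        model.Tag = tag;
}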

Git

πŸ‘ Do keep a similar line ending style

❔ When not controlled, automatic formatting tools can arbitrarily replace an existing line ending style with their own default. This can start to do more damage than good.
πŸ‘€ Line ending style must be enforced at some stage. It is possible to impose normalization using the (shared) git client configuration file. If this isn’t enforced you will end up with a repo full of mixed line endings
πŸ‘€ Implement a policy for fixing formatting issues in legacy components: decide whether a file should be reformatted before or after the main changes, and in the same branch or in another. As a general rule of thumb, avoid mixing file formatting with core functionality changes unless the file has changed by more than 70%

πŸ‘ Do clean obsolete code right away - git remembers everything

❔ YAGNI
πŸ‘€ Git supports tags. Their most common use case is tracking important milestones in project development, such as releases - crucial for shared libraries. For applications, on the other hand, the most important milestone is a feature. While possibly combined with release tagging, features can also be tagged at the moment they are merged. Having tags on features makes git history navigation much easier and removes any reason to keep obsolete code
❔ Another option for tracking a feature's lifecycle is manually maintaining a change log, which is indexed by git as well. By looking at the line history it is then easy to find the moments of feature deployment, deprecation and removal.
πŸ‘€ Both approaches, tags and change logs, require continuous management. Alternatively, tagging only deleted features may be enough to help get rid of ballast
πŸ‘€ Leaving unused/obsolete features in the head of trunk/branch negatively impacts the maintenance time: it increases the volume of code to maintain and potentially confuses developers who might think the feature is still in use and needs maintenance.

πŸ‘ Do review and split the set of code changes to small cohesive commits
This is an important quality assurance tactic as well as a matter of self-discipline. A common misunderstanding leads developers either to make very large commits or, conversely, to try to keep their commits as small as possible throughout development. The latter approach is only possible by always applying a "bottom-up design", which is more prone to design errors, so we specifically do not recommend it. Rather, we encourage developers to practice self-review. The approach should not interfere with the design and development process, which depends on the developer's awareness of the changed subsystem, the patterns they will use and many other factors. The goal is to build the feature first, and then take time to review the changes as a quality assurance step. Here is a good iterative practice/algorithm for decomposing the changes into a smaller, meaningful set of commits after the work on a feature or its major part is done:

  1. Look at the first change you would want to start with and stage that change
  2. Formulate why you had to change this file/class/method in the commit text - it should be meaningful, concise and represent a set of changes which can be pulled while retaining the ability to build and run the code.
  3. Look at the next change
  4. If there are no more changes, or you recognize that none of them relate to the currently formulated reason - commit the staged changes, continue from 8
  5. If the change is made for the same formulated reason - add it to the stage, go to 3
  6. If the change is a prerequisite for the currently formulated reason - discard the current commit text, unstage the changes, continue from 2
  7. If the change is made for another, non-prerequisite reason - skip it for now, go to 3
  8. Check the rest of the changes
  9. If there are still changes left - continue from 1

❔ It is important to keep commits small and cohesive while keeping the code runnable from commit to commit, in order to build a solid change history and simplify the work of other reviewers
❔ The usefulness of code review is a big theme in itself, but the most important outcomes are knowledge sharing and earlier problem detection
πŸ‘€ Most git clients, and several IDE-integrated ones, are able to stage only selected lines from a file when viewed in diff mode