Refactoring

2013-03-21

###Symptoms

Large, bloated classes
Long methods
Long parameter lists
Constants scattered all over the code
Vague method names

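###Techniques

Extract class / move method
Extract method
Decompose conditional
Introduce parameter object / preserve whole object
Replace magic number with symbolic constant
Rename method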
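
Two of these techniques, extract method and replacing a magic number with a symbolic constant, can be sketched as follows. This is only an illustration: the pricing functions and the 8% rate are assumptions made up for the example.

```python
# Hypothetical example: the names and the 8% rate are illustrative assumptions.
SALES_TAX_RATE = 0.08  # named constant instead of a bare 0.08 inside the formula


def subtotal(prices):
    """Extracted method: the sum of the item prices."""
    return sum(prices)


def total_with_tax(prices):
    """Total including sales tax, built on the extracted helper."""
    return subtotal(prices) * (1 + SALES_TAX_RATE)
```

A reader of total_with_tax no longer has to guess what 0.08 means, and the tax rate has a single place to change.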

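###Basic design principles

Separate the application-facing interface from the extension interface
Separate the service domain, the entity domain, and the session domain
Provide interception hooks on important processes
Publish events on important state changes and leave a listener interface
Keep each extension interface to a single responsibility so that extensions can be composed
Use a microkernel with plugins
Do not control the lifecycle of external objects
Anything that is configurable must also be programmable
Separate commands from queries, and make preconditions and postconditions explicit
Extend incrementally rather than stretching the original core concepts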
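
As a minimal sketch of the command/query principle, assuming a hypothetical Inventory class (none of these names come from a real framework):

```python
class Inventory:
    """Hypothetical example of command/query separation."""

    def __init__(self):
        self._stock = {}

    def add(self, item, quantity):
        """Command: changes state and returns nothing.

        Precondition: quantity > 0.
        Postcondition: count(item) has grown by quantity.
        """
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._stock[item] = self._stock.get(item, 0) + quantity

    def count(self, item):
        """Query: returns a value and has no side effects."""
        return self._stock.get(item, 0)
```

Because count never changes anything, it can be called freely in assertions and logging; only add needs its precondition checked.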

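What follows is a set of reading notes on The Art of Readable Code, reposted with a few observations of my own. I recommend the book:

English edition: The Art of Readable Code
Chinese edition: 编写可读代码的艺术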

##Why code should be easy to understand

“Code should be written to minimize the time it would take for someone else to understand it.”

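The facts of everyday work are:

Thinking before writing code and reading code take far more time than the writing itself
Reading code is routine, whether it is someone else's or your own; code you wrote six months ago might as well be someone else's
When code is readable, you can grasp the program's logic quickly and get to work
Fewer lines of code are not necessarily easier to understand
Readability rarely conflicts with efficiency, good architecture, or testability

The whole book is built around one goal: making code more readable. That is also one of the main hallmarks of good code.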

##Naming

###Pack more information into names

####Choose specific words

For example, use download rather than get. Some more expressive alternatives:

 send -> deliver, dispatch, announce, distribute, route
 find -> search, extract, locate, recover
start -> launch, create, begin, open
 make -> create, set up, build, generate, compose, add, new

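####Avoid generic names

Names like tmp and retval say nothing beyond "this is a temporary" or "this is the return value". Adding a meaningful word makes the purpose clear: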

tmp_file = tempfile.NamedTemporaryFile() 
...
SaveData(tmp_file, ...)

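Instead of retval, use a name that says what the variable actually holds. With a name like sum_squares, a bug such as this one becomes obvious: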

sum_squares += v[i]; // Where's the "square" that we're summing? Bug!
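
For context, a complete function using that name might look like the sketch below. It is a Python rendering written for these notes, and the name euclidean_norm is an assumption about what the surrounding function computes.

```python
import math


def euclidean_norm(v):
    """Return the square root of the sum of squares of the elements of v."""
    # 'sum_squares' says what is being accumulated, unlike 'retval'.
    sum_squares = 0.0
    for x in v:
        sum_squares += x * x
    return math.sqrt(sum_squares)
```

With this name, a line such as sum_squares += x would look wrong immediately, because nothing is being squared.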

In nested for loops, i, j, and k can be just as confusing. The indices in the condition below are swapped, but the bug is hard to see:

for (int i = 0; i < clubs.size(); i++)
    for (int j = 0; j < clubs[i].members.size(); j++)
        for (int k = 0; k < users.size(); k++)
            if (clubs[i].members[k] == users[j])
                cout << "user[" << j << "] is in club[" << i << "]" << endl;

More precise index names make the code much clearer:

 if (clubs[ci].members[mi] == users[ui])  // OK. First letters match.

So a generic name needs a good reason behind it.

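####Prefer concrete names

CanListenOnPort is better than ServerCanStart: "can start" is vague, while "listen on port" says exactly what the method checks.

Likewise, --run_locally is less clear than --extra_logging.

####Attach important details

Add details such as a unit suffix (_ms) or a _raw suffix for unprocessed strings. If a variable is important, a few extra words in its name make the code easier to read; for example, string id; // Example: "af84ef845cd8" becomes string hex_id;.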

             Start(int delay)  -->  delay → delay_secs
        CreateCache(int size)  -->  size → size_mb
ThrottleDownload(float limit)  -->  limit → max_kbps
          Rotate(float angle)  -->  angle → degrees_cw

More examples:

password  ->  plaintext_password
 comment  ->  unescaped_comment
    html  ->  html_utf8
    data  ->  data_urlenc

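####Use longer names for larger scopes

Short names are fine in a small scope. A variable used across a larger scope deserves a longer name, and editor auto-completion keeps the typing cost low. For abbreviations and prefixes, stick to widely known ones (such as str). A useful test: would a new team member understand the prefix without asking anyone?

####Use name formatting to convey meaning

Use characters such as _ and - deliberately, for example an _ prefix for private members.

var x = new DatePicker(); // DatePicker() is a "constructor" function, so it starts with a capital letter
var y = pageHeight(); // pageHeight() is an ordinary function

var $all_images = $("img"); // $all_images is a jQuery object
var height = 250; // height is not

<!-- ids use underscores, classes use hyphens -->
<div id="middle_column" class="main-content"> ...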

###Names must not be ambiguous

Before settling on a name, ask whether the word could be read another way. For example:

results = Database.all_objects.filter("year <= 2011")

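Does results now hold the objects whose year is <= 2011, or the ones that are left after those were removed? filter could mean either "select" or "exclude".

####Use min and max for limits

CART_TOO_BIG_LIMIT = 10
if shopping_cart.num_items() >= CART_TOO_BIG_LIMIT:
    Error("Too many items in cart.")

MAX_ITEMS_IN_CART = 10
if shopping_cart.num_items() > MAX_ITEMS_IN_CART:
    Error("Too many items in cart.")

Compare CART_TOO_BIG_LIMIT with MAX_ITEMS_IN_CART. The first does not say whether the limit itself is allowed, so the >= check may be off by one; the second makes it clear that 10 items are still acceptable.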

####Use first and last for inclusive ranges

print integer_range(start=2, stop=4)
# Does this print [2,3] or [2,3,4] (or something else)?

set.PrintKeys(first="Bart", last="Maggie")

first and last are unambiguous, so they suit inclusive (closed) ranges.

####Use begin and end for half-open ranges such as [2, 9)

PrintEventsInRange("OCT 16 12:00am", "OCT 17 12:00am")

PrintEventsInRange("OCT 16 12:00am", "OCT 16 11:59:59.9999pm")

The first call is far more comfortable to write than the second.
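
A minimal sketch of a half-open range check, assuming a hypothetical events_in_range helper that works on plain datetime values (the year 2023 is arbitrary):

```python
from datetime import datetime


def events_in_range(events, begin, end):
    """Return the timestamps t with begin <= t < end (half-open range)."""
    return [t for t in events if begin <= t < end]


events = [
    datetime(2023, 10, 16, 0, 0),
    datetime(2023, 10, 16, 23, 59, 59, 999999),
    datetime(2023, 10, 17, 0, 0),
]
# All of October 16: begin is included, end (midnight on the 17th) is excluded.
oct_16 = events_in_range(events, datetime(2023, 10, 16), datetime(2023, 10, 17))
```

Because end is excluded, consecutive days can be queried with (day, next_day) pairs without gaps or overlaps, and nobody has to type 11:59:59.9999pm.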

####Naming booleans

bool read_password = true;

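This name is dangerous: does it mean the password still needs to be read, or that it has already been read? There is no way to tell, so a name such as user_is_authenticated is a better choice. In general, adding is, has, can, or should to a boolean name makes its meaning clearer, for example: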

             SpaceLeft()  -->  hasSpaceLeft()
bool disable_ssl = false  -->  bool use_ssl = true

####Match the reader's expectations

public class StatisticsCollector {
    public void addSample(double x) { ... }
    public double getMean() {
        // Iterate through all samples and return total / num_samples
    }
    ...
}

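In this example, getMean iterates over all the samples to compute the mean, so it is not the lightweight accessor that a get prefix usually implies. computeMean would be a better name.

##Aesthetics

Well-formatted code is pleasant to look at and easier to read. Compare the following two examples: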

class StatsKeeper {
   public:
   // A class for keeping track of a series of doubles
      void Add(double d);  // and methods for quick statistics about them
     private:   int count;        /* how many so    far
   */ public:
           double Average();
   private:   double minimum;
   list<double>
     past_items
         ;double maximum;
};

And here is the cleaned-up version:

// A class for keeping track of a series of doubles
// and methods for quick statistics about them.
class StatsKeeper {
  public:
    void Add(double d);
    double Average();
  private:
    list<double> past_items;
    int count;  // how many so far
    double minimum;
    double maximum;
};

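###Make line breaks consistent and compact

The following code has to be broken across lines to stay within 80 characters per line, and its parameters need explanatory comments: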

public class PerformanceTester {
    public static final TcpConnectionSimulator wifi = new TcpConnectionSimulator(
        500, /* Kbps */
        80, /* millisecs latency */
        200, /* jitter */
        1 /* packet loss % */);

    public static final TcpConnectionSimulator t3_fiber = new TcpConnectionSimulator(
        45000, /* Kbps */
        10, /* millisecs latency */
        0, /* jitter */
        0 /* packet loss % */);

    public static final TcpConnectionSimulator cell = new TcpConnectionSimulator(
        100, /* Kbps */
        400, /* millisecs latency */
        250, /* jitter */
        5 /* packet loss % */);
}

To make the line breaks consistent, first rearrange it like this:

public class PerformanceTester {
    public static final TcpConnectionSimulator wifi =
        new TcpConnectionSimulator(
            500,   /* Kbps */
            80,    /* millisecs latency */
            200,   /* jitter */
            1      /* packet loss % */);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(
            45000, /* Kbps */
            10,    /* millisecs latency */
            0,     /* jitter */
            0      /* packet loss % */);

    public static final TcpConnectionSimulator cell =
        new TcpConnectionSimulator(
            100,   /* Kbps */
            400,   /* millisecs latency */
            250,   /* jitter */
            5      /* packet loss % */);
}

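This is more consistent, but it is still verbose and takes up a lot of vertical space. Moving the comments to the top fixes that: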

public class PerformanceTester {
    // TcpConnectionSimulator(throughput, latency, jitter, packet_loss)
    //                            [Kbps]   [ms]    [ms]    [percent]
    public static final TcpConnectionSimulator wifi =
        new TcpConnectionSimulator(500,    80,     200,     1);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(45000,  10,     0,       0);

    public static final TcpConnectionSimulator cell =
        new TcpConnectionSimulator(100,    400,    250,     5);
}

###Wrap repetition in helper functions

// Turn a partial_name like "Doug Adams" into "Mr. Douglas Adams".
// If not possible, 'error' is filled with an explanation.
string ExpandFullName(DatabaseConnection dc, string partial_name, string* error);

DatabaseConnection database_connection;
string error;
assert(ExpandFullName(database_connection, "Doug Adams", &error)
        == "Mr. Douglas Adams");
assert(error == "");
assert(ExpandFullName(database_connection, " Jake Brown ", &error)
        == "Mr. Jacob Brown III");
assert(error == "");
assert(ExpandFullName(database_connection, "No Such Guy", &error) == "");
assert(error == "no match found");
assert(ExpandFullName(database_connection, "John", &error) == "");
assert(error == "more than one result");

This code looks messy and is full of repetition. A helper function cleans it up:

CheckFullName("Doug Adams", "Mr. Douglas Adams", "");
CheckFullName(" Jake Brown ", "Mr. Jacob Brown III", "");
CheckFullName("No Such Guy", "", "no match found");
CheckFullName("John", "", "more than one result");

void CheckFullName(string partial_name,
                   string expected_full_name,
                   string expected_error) {
    // database_connection is now a class member
    string error;
    string full_name = ExpandFullName(database_connection, partial_name, &error);
    assert(error == expected_error);
    assert(full_name == expected_full_name);
}

###Column alignment

Aligning columns can make a block of code easier to scan:

CheckFullName("Doug Adams"  , "Mr. Douglas Adams"  , "");
CheckFullName(" Jake Brown ", "Mr. Jacob Brown III", "");
CheckFullName("No Such Guy" , ""                   , "no match found");
CheckFullName("John"        , ""                   , "more than one result");

commands[] = {
    ...
    { "timeout"      , NULL              , cmd_spec_timeout},
    { "timestamping" , &opt.timestamping , cmd_boolean},
    { "tries"        , &opt.ntry         , cmd_number_inf},
    { "useproxy"     , &opt.use_proxy    , cmd_boolean},
    { "useragent"    , NULL              , cmd_spec_useragent},
    ...
};

###Group code into blocks

class FrontendServer {
    public:
        FrontendServer();
        void ViewProfile(HttpRequest* request);
        void OpenDatabase(string location, string user);
        void SaveProfile(HttpRequest* request);
        string ExtractQueryParam(HttpRequest* request, string param);
        void ReplyOK(HttpRequest* request, string html);
        void FindFriends(HttpRequest* request);
        void ReplyNotFound(HttpRequest* request, string error);
        void CloseDatabase(string location);
        ~FrontendServer();
};

This is readable, but it can be improved by grouping related declarations:

class FrontendServer {
    public:
        FrontendServer();
        ~FrontendServer();

        // Handlers
        void ViewProfile(HttpRequest* request);
        void SaveProfile(HttpRequest* request);
        void FindFriends(HttpRequest* request);

        // Request/Reply Utilities
        string ExtractQueryParam(HttpRequest* request, string param);
        void ReplyOK(HttpRequest* request, string html);
        void ReplyNotFound(HttpRequest* request, string error);

        // Database Helpers
        void OpenDatabase(string location, string user);
        void CloseDatabase(string location);
};

Here is another example:

# Import the user's email contacts, and match them to users in our system.
# Then display a list of those users that he/she isn't already friends with.
def suggest_new_friends(user, email_password):
    friends = user.friends()
    friend_emails = set(f.email for f in friends)
    contacts = import_contacts(user.email, email_password)
    contact_emails = set(c.email for c in contacts)
    non_friend_emails = contact_emails - friend_emails
    suggested_friends = User.objects.select(email__in=non_friend_emails)
    display['user'] = user
    display['friends'] = friends
    display['suggested_friends'] = suggested_friends
    return render("suggested_friends.html", display)

Everything runs together, which is visually overwhelming. Break it into blocks by purpose:

def suggest_new_friends(user, email_password):
    # Get the user's friends' email addresses.
    friends = user.friends()
    friend_emails = set(f.email for f in friends)

    # Import all email addresses from this user's email account.
    contacts = import_contacts(user.email, email_password)
    contact_emails = set(c.email for c in contacts)

    # Find matching users that they aren't already friends with.
    non_friend_emails = contact_emails - friend_emails
    suggested_friends = User.objects.select(email__in=non_friend_emails)

    # Display these lists on the page.
    display['user'] = user
    display['friends'] = friends
    display['suggested_friends'] = suggested_friends

    return render("suggested_friends.html", display)

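Making code pleasant to read takes attention while writing and a few good habits. This matters most on a team: a style choice such as brace placement is neither right nor wrong in itself, but ignoring the team's convention is wrong.

##Commenting

While writing code you think about a great deal, but in the end the reader sees only the code itself and the rest of that information is lost. The purpose of comments is to pass more of it on to the reader.

###What to comment

####What not to comment

Comments like these add no value: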

// The class definition for Account
class Account {
    public:
        // Constructor
        Account();
        // Set the profit member to a new value
        void SetProfit(double profit);
        // Return the profit from this Account
        double GetProfit();
};

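####Don't comment just for the sake of commenting

A comment that only restates the signature is worthless. If a function deserves a comment, it should add detail that the declaration does not give, as this one does: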

// Find a Node with the given 'name' or return NULL.
// If depth <= 0, only 'subtree' is inspected.
// If depth == N, only 'subtree' and N levels below are inspected.
Node* FindNodeInSubtree(Node* subtree, string name, int depth);

####Don't comment bad names; fix the names

// Enforce limits on the Reply as stated in the Request,
// such as the number of items returned, or total byte size, etc. 
void CleanReply(Request request, Reply reply);

Most of the comment explains what "clean" means. It is better to pick a more accurate name:

// Make sure 'reply' meets the count/byte/etc. limits from the 'request' 
void EnforceLimitsFromRequest(Request request, Reply reply);

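####Record your thoughts

Having covered what not to comment, what should be commented? Comments should record the insights you had while working out how to write the code, for example: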

// Surprisingly, a binary tree was 40% faster than a hash table for this data.
// The cost of computing a hash was more than the left/right comparisons.

// This heuristic might miss a few words. That's OK; solving this 100% is hard.

// This class is getting messy. Maybe we should create a 'ResourceNode' subclass to
// help organize things.

Comments can also flag unfinished work and explain constants:

// TODO: use a faster algorithm
// TODO(dustin): handle other image formats besides JPEG

NUM_THREADS = 8 # as long as it's >= 2 * num_processors, that's good enough.

// Impose a reasonable limit - no human can read that much anyway.
const int MAX_RSS_SUBSCRIPTIONS = 1000;

Common markers:

TODO  : Stuff I haven't gotten around to yet
FIXME : Known-broken code here
HACK  : Admittedly inelegant solution to a problem
XXX   : Danger! Major problem here

####Put yourself in the reader's shoes

The parts of your code that would make a reader stop and ask a question are the parts to comment.

struct Recorder {
    vector<float> data;
    ...
    void Clear() {
        vector<float>().swap(data); // Huh? Why not just data.clear()? 
    }
};

Many C++ programmers would wonder here why the code does not simply call data.clear() instead of swapping with an empty vector, so the line deserves a comment:

// Force vector to relinquish its memory (look up "STL swap trick")
vector<float>().swap(data);

####Advertise likely pitfalls

If you used a hack while writing the code, or there is some other trap that readers need to know about, document it:

void SendEmail(string to, string subject, string body);

In fact this function calls an external service and can time out, so it needs a comment:

// Calls an external service to deliver email.  (Times out after 1 minute.)
void SendEmail(string to, string subject, string body);

####Big-picture comments

Sometimes a comment for the whole file helps give the reader an overall picture:

// This file contains helper functions that provide a more convenient interface to our
// file system. It handles file permissions and other nitty-gritty details.

####Summary comments

Inside a function, too, a comment can summarize a block the way a file comment summarizes a file:

# Find all the items that customers purchased for themselves.
for customer_id in all_customers:
    for sale in all_sales[customer_id].sales:
        if sale.recipient == customer_id:
            ...

Or comment the steps of a function as it goes:

def GenerateUserReport():
    # Acquire a lock for this user
    ...
    # Read user's info from the database
    ...
    # Write info to a file
    ...
    # Release the lock for this user

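Many people are reluctant to write comments, and writing good ones is indeed not easy. One option is to set aside a dedicated place in the file for comments, where you can write down whatever you want to say.

###Make comments precise and compact

The previous section covered what to comment; this one covers how. Comments matter, so they should be precise; they also take up screen space, so they should be compact.

####Keep comments compact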

// The int is the CategoryType.
// The first float in the inner pair is the 'score',
// the second is the 'weight'.
typedef hash_map<int, pair<float, float> > ScoreMap;

This is too wordy. It can be compressed to:

// CategoryType -> (score, weight)
typedef hash_map<int, pair<float, float> > ScoreMap;

####Avoid ambiguous pronouns

// Insert the data into the cache, but check if it's too big first.

It is unclear whether "it's" refers to the data or the cache. Rewrite it as:

// Insert the data into the cache, but check if the data is too big first.

Better still, restructure the sentence so that "it" has only one possible referent:

// If the data is small enough, insert it into the cache.

####Polish sloppy sentences

# Depending on whether we've already crawled this URL before, give it a different priority.

That sentence is hard to follow. This version is much easier:

# Give higher priority to URLs we've never crawled before.

####Describe function behavior precisely

// Return the number of lines in this file.
int CountLines(string filename) { ... }

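A function like this can leave callers puzzled, because "line" can be defined in several ways:

"" (an empty file): 0 lines or 1?
"hello": 0 or 1?
"hello\n": 1 or 2?
"hello\n world": 1 or 2?
"hello\n\r cruel\n world\r": 2, 3, or 4?

So the comment should be written like this: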

// Count how many newline bytes ('\n') are in the file.
int CountLines(string filename) { ... }
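
With that definition, every one of the questions above has a definite answer. A minimal Python sketch, assuming for simplicity that the function takes the file contents as bytes rather than a filename:

```python
def count_lines(data):
    # Count how many newline bytes (b"\n") are in 'data'.
    # Assumption for this sketch: 'data' is the file's contents as bytes.
    return data.count(b"\n")
```

Under this definition the five cases above give 0, 0, 1, 1, and 2 respectively.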

####Use examples to illustrate corner cases

// Rearrange 'v' so that elements < pivot come before those >= pivot;
// Then return the largest 'i' for which v[i] < pivot (or -1 if none are < pivot)
int Partition(vector<int>* v, int pivot);

The description is precise, but an example makes it even better:

// ...
// Example: Partition([8 5 9 8 2], 8) might result in [5 2 | 8 9 8] and return 1
int Partition(vector<int>* v, int pivot);
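
One possible implementation that satisfies this comment is sketched below in Python. It is not the book's code; it is a simple stable version chosen for clarity, and many other orderings within each half would satisfy the comment equally well.

```python
def partition(v, pivot):
    """Rearrange 'v' in place so that elements < pivot come before those >= pivot.

    Returns the largest index i for which v[i] < pivot, or -1 if none are < pivot.
    """
    smaller = [x for x in v if x < pivot]
    rest = [x for x in v if x >= pivot]
    v[:] = smaller + rest  # overwrite the caller's list in place
    return len(smaller) - 1
```

For the input in the comment, this particular version produces exactly [5 2 | 8 9 8] and returns 1, which is why the comment says "might result in": the contract fixes the split point, not the order inside each half.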

####State the intent of your code

void DisplayProducts(list<Product> products) {
    products.sort(CompareProductByPrice);
    // Iterate through the list in reverse order
    for (list<Product>::reverse_iterator it = products.rbegin(); it != products.rend();
            ++it)
        DisplayPrice(it->price);
    ... 
}

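The comment says that the list is traversed in reverse, but that is not precise enough about the purpose. It should state the intent: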

// Display each price, from highest to lowest
for (list<Product>::reverse_iterator it = products.rbegin(); ... )

####Clarifying arguments at call sites

A call like this is baffling on its own:

Connect(10, false);

If the parameters are named at the call site, as Python allows, the call becomes much clearer:

def Connect(timeout, use_encryption):  ...

# Call the function using named parameters
Connect(timeout = 10, use_encryption = False)

####Use information-dense words

// This class contains a number of members that store the same information as in the
// database, but are stored here for speed. When this class is read from later, those
// members are checked first to see if they exist, and if so are returned; otherwise the
// database is read from and that data stored in those fields for next time.

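This long comment is clear enough, but a single well-chosen term says the same thing without leaving any doubt:

// This class acts as a caching layer to the database.

##Simplifying loops and logic

###Keep control flow simple

The goal of this section is to make conditionals, loops, and other control flow as natural as possible, so that the reader never has to stop and think or go back and reread.

####Order of arguments in conditionals

Compare these two ways of writing the same conditions: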

if (length >= 10)
while (bytes_received < bytes_expected)

if (10 <= length)
while (bytes_expected > bytes_received)

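Should the operands simply follow greater-than/less-than order, or is there another rule? There is: order them by role.

Left-hand side: usually the expression being examined, whose value changes more
Right-hand side: usually the value it is compared against, which is more constant

That is why bytes_received < bytes_expected reads more naturally than the reverse.

####Order of if/else blocks

Usually you are free to choose the order of the if/else blocks. Both of these work: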

if (a == b) {
    // Case One ...
} else {
    // Case Two ...
}

if (a != b) {
    // Case Two ...
} else {
    // Case One ...
}

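You may never have given this much thought, but sometimes one order really is better than the other:

Handle the positive case first: if (debug) is better than if (!debug)
Handle the simpler case first, so that the if and the else fit on one screen together
Handle the more interesting or conspicuous case first

An example: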

if (!url.HasQueryParameter("expand_all")) {
    response.Render(items);
    ...
} else {
    for (int i = 0; i < items.size(); i++) {
        items[i].Expand();
    }
    ... 
}

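On seeing the if, the first thing that comes to mind is expand_all. It is like being told "don't think of an elephant": you cannot help thinking about it, which causes a moment of confusion. It is better written as: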

if (url.HasQueryParameter("expand_all")) {
    for (int i = 0; i < items.size(); i++) {
        items[i].Expand();
    }
    ... 
} else {
    response.Render(items);
    ... 
}

####The ternary operator (?:)

time_str += (hour >= 12) ? "pm" : "am";

// Avoiding the ternary operator, you might write:
if (hour >= 12) {
    time_str += "pm";
} else {
    time_str += "am";
}

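The ternary operator reduces the number of lines, and the example above is a good use of it. But the real goal is to reduce the time needed to read the code, so it is a poor fit for a case like the following, where the if/else form is easier to read: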

return exponent >= 0 ? mantissa * (1 << exponent) : mantissa / (1 << -exponent);

if (exponent >= 0) {
    return mantissa * (1 << exponent);
} else {
    return mantissa / (1 << -exponent);
}

So use it only for simple expressions.

####Avoid do/while loops

do {
    continue;
} while (false);

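How many times does this loop run? It takes a moment to work out (once: continue jumps to the condition check, which is false). A do/while can always be rewritten in another form, so it is best avoided.

####Return early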

public boolean Contains(String str, String substr) {
    if (str == null || substr == null) return false;
    if (substr.equals("")) return true;
    ...
}

Returning early from a function makes its logic clearer.

####Minimize nesting

if (user_result == SUCCESS) {
    if (permission_result != SUCCESS) {
        reply.WriteErrors("error reading permissions");
        reply.Done();
        return;
    }
    reply.WriteErrors("");
} else {
    reply.WriteErrors(user_result);
}
reply.Done();

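This code has only one level of nesting, yet it is still slightly confusing. Does your own code have similar cases? Think about it differently and apply the return-early principle, and it reads much more easily: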

if (user_result != SUCCESS) {
    reply.WriteErrors(user_result);
    reply.Done();
    return;
}
if (permission_result != SUCCESS) {
    reply.WriteErrors("error reading permissions");
    reply.Done();
    return;
}
reply.WriteErrors("");
reply.Done();

The same approach works for nesting inside loops:

for (int i = 0; i < results.size(); i++) {
    if (results[i] != NULL) {
        non_null_count++;
        if (results[i]->name != "") {
            cout << "Considering candidate..." << endl;
            ...
        }
    }
}

Rewrite it using continue, the loop equivalent of an early return:

for (int i = 0; i < results.size(); i++) {
    if (results[i] == NULL) continue;
    non_null_count++;

    if (results[i]->name == "") continue;
    cout << "Considering candidate..." << endl;
    ... 
}

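###Break down complex expressions

The more complex an expression is, the harder it is to read, so split large, complicated expressions into small, digestible pieces.

####Explaining variables

The simplest way to break down a complex expression is to introduce a variable: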

if line.split(':')[0].strip() == "root":

# With an explaining variable:
username = line.split(':')[0].strip() 
if username == "root":
    ...

Or this example:

if (request.user.id == document.owner_id) {
    // user can edit this document...
}
...
if (request.user.id != document.owner_id) {
    // document is read-only...
}

// With a summary variable:
final boolean user_owns_document = (request.user.id == document.owner_id);
if (user_owns_document) {
    // user can edit this document...
}
...
if (!user_owns_document) {
    // document is read-only...
}

####Use De Morgan's laws

1) not (a or b or c)   <=> (not a) and (not b) and (not c)
2) not (a and b and c) <=> (not a) or (not b) or (not c)

So a condition like this can be rewritten:

if (!(file_exists && !is_protected)) Error("Sorry, could not read file.");

// Rewritten:
if (!file_exists || is_protected) Error("Sorry, could not read file.");
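
Both laws can be checked by brute force over every combination of truth values. A small sketch (the function name de_morgan_holds is made up for this check):

```python
from itertools import product


def de_morgan_holds():
    """Check both De Morgan identities for all 8 combinations of a, b, c."""
    for a, b, c in product([False, True], repeat=3):
        if (not (a or b or c)) != ((not a) and (not b) and (not c)):
            return False
        if (not (a and b and c)) != ((not a) or (not b) or (not c)):
            return False
    return True
```

The rewrite above is the second law applied to two operands: not (file_exists and not is_protected) becomes (not file_exists) or is_protected.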

####Don't abuse short-circuit logic

assert((!(bucket = FindBucket(key))) || !bucket->IsOccupied());

This can be replaced by the code below, which takes two lines but is easier to understand:

bucket = FindBucket(key);
if (bucket != NULL) assert(!bucket->IsOccupied());

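Expressions like the following are also best avoided where they might surprise the reader, because in some languages x is assigned the value of the first operand that evaluates as true rather than a boolean: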

x = a || b || c
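
Python's or behaves this way, for instance. A minimal sketch (the function name first_truthy is only illustrative):

```python
def first_truthy(a, b, c):
    # 'or' returns one of its operands, not necessarily a bool:
    # the first truthy one, or the last operand if none is truthy.
    return a or b or c
```

So first_truthy(0, "", "x") gives "x", and first_truthy(0, "", None) gives None, not False.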

####Break down giant statements

var update_highlight = function (message_num) {
    if ($("#vote_value" + message_num).html() === "Up") {
        $("#thumbs_up" + message_num).addClass("highlighted");
        $("#thumbs_down" + message_num).removeClass("highlighted");
    } else if ($("#vote_value" + message_num).html() === "Down") {
        $("#thumbs_up" + message_num).removeClass("highlighted");
        $("#thumbs_down" + message_num).addClass("highlighted");
    } else {
        $("#thumbs_up" + message_num).removeClass("highlighted");
        $("#thumbs_down" + message_num).removeClass("highlighted");
    }
};

There is a lot of repetition here. Introducing variables simplifies it:

var update_highlight = function (message_num) {
    var thumbs_up = $("#thumbs_up" + message_num);
    var thumbs_down = $("#thumbs_down" + message_num);
    var vote_value = $("#vote_value" + message_num).html();
    var hi = "highlighted";

    if (vote_value === "Up") {
        thumbs_up.addClass(hi);
        thumbs_down.removeClass(hi);
    } else if (vote_value === "Down") {
        thumbs_up.removeClass(hi);
        thumbs_down.addClass(hi);
    } else {
        thumbs_up.removeClass(hi);
        thumbs_down.removeClass(hi);
    }
}

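###Variables and readability

####Eliminating variables

The previous section used variables to break down large expressions. This one is about eliminating variables that are not needed.

####Useless temporary variables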

now = datetime.datetime.now()
root_message.last_view_time = now

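The variable now can be removed, because:

It does not break down a complex expression
It adds no clarity, since `datetime.datetime.now()` is already clear
It is used only once

So the code can simply be written as: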

root_message.last_view_time = datetime.datetime.now()

####Eliminate control-flow variables

boolean done = false;
while (/* condition */ && !done) {
    ...
    if (...) {
        done = true;
        continue; 
    }
}

The done variable can be eliminated with a cleaner structure:

while (/* condition */) {
    ...
    if (...) {
        break;
    } 
}

This example is easy to fix. With more deeply nested loops, a single break may not be enough; in that case, move the code into a function.
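
A minimal sketch of that last point, assuming a hypothetical search over a grid of numbers: once the nested loops live in their own function, return leaves both loops at once and no flag is needed.

```python
def find_first_negative(grid):
    """Return (row, col) of the first negative value, or None if there is none."""
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value < 0:
                return (r, c)  # exits both loops; no 'done' flag required
    return None
```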

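####Shrink the scope of your variables

We have all heard the advice to avoid global variables. The larger a variable's scope, the harder it is to keep track of, so keep scopes small.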

class LargeClass {
    string str_;
    void Method1() {
        str_ = ...;
        Method2();
    }
    void Method2() {
        // Uses str_
    }
    // Lots of other methods that don't use str_ 
    ...
};

Here str_ has a wider scope than it needs. The code can be rewritten:

class LargeClass {
    void Method1() {
        string str = ...;
        Method2(str); 
    }
    void Method2(string str) {
        // Uses str
    }
    // Now other methods can't see str.
};

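Passing str as a function parameter shrinks its scope and makes the code easier to read. The same idea applies to class design: split a large class into smaller ones.

####Languages without nested scope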

# No use of example_value up to this point.
if request:
    for value in request.values:
        if value > 0:
            example_value = value
            break

for logger in debug.loggers:
    logger.log("Example:", example_value)

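If request is empty or contains no positive value, example_value is never assigned and the logging loop fails at runtime because the name is undefined. The fix is not hard: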

example_value = None
if request:
    for value in request.values:
        if value > 0:
            example_value = value
            break

if example_value:
    for logger in debug.loggers:
        logger.log("Example:", example_value)

But following the earlier advice about eliminating intermediate variables, there is a better way:

def LogExample(value):
    for logger in debug.loggers:
        logger.log("Example:", value)

if request:
    for value in request.values:
        if value > 0:
            LogExample(value)  # deal with 'value' immediately
            break

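####Declare variables where they are used

Early versions of C required all variables to be declared at the top of a function or block. With many variables, the reader has to hold all of them in mind from the start, so a variable that is not needed right away is better declared later. The following code defines everything at the top: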

def ViewFilteredReplies(original_id):
    filtered_replies = []
    root_message = Messages.objects.get(original_id) 
    all_replies = Messages.objects.select(root_id=original_id)
    root_message.view_count += 1
    root_message.last_view_time = datetime.datetime.now()
    root_message.save()

    for reply in all_replies:
        if reply.spam_votes <= MAX_SPAM_VOTES:
            filtered_replies.append(reply)

    return filtered_replies

The reader has to take in too many variables at once. Moving each definition to just before its first use helps:

def ViewFilteredReplies(original_id):
    root_message = Messages.objects.get(original_id)
    root_message.view_count += 1
    root_message.last_view_time = datetime.datetime.now()
    root_message.save()

    all_replies = Messages.objects.select(root_id=original_id) 
    filtered_replies = []
    for reply in all_replies:
        if reply.spam_votes <= MAX_SPAM_VOTES:
            filtered_replies.append(reply)

    return filtered_replies

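####Prefer write-once variables

The previous sections showed that too many variables confuse the reader. A variable that is reassigned again and again is just as disorienting; the fewer times a variable changes, the more readable the code.

####An example

Suppose a page contains the inputs below, and the first empty one needs to be given a value: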

<input type="text" id="input1" value="Dustin">
<input type="text" id="input2" value="Trevor">
<input type="text" id="input3" value="">
<input type="text" id="input4" value="Melissa">
...
var setFirstEmptyInput = function (new_value) {
    var found = false;
    var i = 1;
    var elem = document.getElementById('input' + i);
    while (elem !== null) {
        if (elem.value === '') {
            found = true;
            break; 
        }
        i++;
        elem = document.getElementById('input' + i);
    }
    if (found) elem.value = new_value;
    return elem;
};

This code works. It has three variables, which we can improve one at a time. found is an intermediate variable and can be eliminated entirely:

var setFirstEmptyInput = function (new_value) {
    var i = 1;
    var elem = document.getElementById('input' + i);
    while (elem !== null) {
        if (elem.value === '') {
            elem.value = new_value;
            return elem;
        }
        i++;
        elem = document.getElementById('input' + i);
    }
    return null;
};

Next, elem is assigned in several places as the loop runs, which makes its value hard to follow, and i can be folded into a for loop:

var setFirstEmptyInput = function (new_value) {
    for (var i = 1; true; i++) {
        var elem = document.getElementById('input' + i);
        if (elem === null)
            return null;  // Search Failed. No empty input found.
        if (elem.value === '') {
            elem.value = new_value;
            return elem;
        }
    }
};

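##Reorganizing your code

###Extracting unrelated subproblems

Engineering is about breaking big problems down into smaller ones and solving them one by one, which also helps keep the code robust and readable. Some guidelines for separating out subproblems:

Look at a given function or block of code and ask yourself, "What is the high-level goal of this code?"
For each line of code, ask, "Is it working directly toward that goal, or is it solving an unrelated subproblem?"
If enough lines are solving an unrelated subproblem, extract that code into a separate function

An example: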

ajax_post({
    url: 'http://example.com/submit',
    data: data,
    on_success: function (response_data) {
        var str = "{\n";
        for (var key in response_data) {
            str += "  " + key + " = " + response_data[key] + "\n";
        }
        alert(str + "}");
        // Continue handling 'response_data' ...
    }
});

The goal of this code is to send an Ajax request and handle the response, so the string formatting can be extracted into its own function:

var format_pretty = function (obj) {
    var str = "{\n";
    for (var key in obj) {
        str += "  " + key + " = " + obj[key] + "\n";
    }
    return str + "}";
};

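####Unexpected benefits

There are many good reasons to extract format_pretty. A standalone function is easy to add features to, make more robust, and extend to handle edge cases. So format_pretty can be beefed up into a more capable function: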

var format_pretty = function (obj, indent) {
    // Handle null, undefined, strings, and non-objects.
    if (obj === null) return "null";
    if (obj === undefined) return "undefined";
    if (typeof obj === "string") return '"' + obj + '"';
    if (typeof obj !== "object") return String(obj);
    if (indent === undefined) indent = "";

    // Handle (non-null) objects.

    var str = "{\n";
    for (var key in obj) {
        str += indent + "  " + key + " = ";
        str += format_pretty(obj[key], indent + "  ") + "\n";
    }
    return str + indent + "}";
};

For a nested object, this function produces output like:

{
  key1 = 1
  key2 = true
  key3 = undefined
  key4 = null
  key5 = {
    key5a = {
      key5a1 = "hello world"
    }
  }
}

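Doing this kind of thing repeatedly is how you build up code over time: such functions can be reused, collected into your own library, or shared with others.

####Project-specific functionality

Code unrelated to the main goal can be extracted and reused. Project-specific code can be extracted too, to keep the surrounding code readable. For example: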

business = Business()
business.name = request.POST["name"]

url_path_name = business.name.lower()
url_path_name = re.sub(r"['\.]", "", url_path_name) 
url_path_name = re.sub(r"[^a-z0-9]+", "-", url_path_name) 
url_path_name = url_path_name.strip("-")
business.url = "/biz/" + url_path_name

business.date_created = datetime.datetime.utcnow() 
business.save_to_database()

Extracted into a function, it reads much better:

CHARS_TO_REMOVE = re.compile(r"['\.]+")
CHARS_TO_DASH = re.compile(r"[^a-z0-9]+")

def make_url_friendly(text):
    text = text.lower()
    text = CHARS_TO_REMOVE.sub('', text) 
    text = CHARS_TO_DASH.sub('-', text) 
    return text.strip("-")

business = Business()
business.name = request.POST["name"]
business.url = "/biz/" + make_url_friendly(business.name) 
business.date_created = datetime.datetime.utcnow() 
business.save_to_database()

####Simplify an existing interface

Here is some code that reads a value from a cookie:

var max_results;
var cookies = document.cookie.split(';');
for (var i = 0; i < cookies.length; i++) {
    var c = cookies[i];
    c = c.replace(/^[ ]+/, '');  // remove leading spaces
    if (c.indexOf("max_results=") === 0)
        max_results = Number(c.substring(12, c.length));
}

This code is ugly. An ideal interface would look like this:

set_cookie(name, value, days_to_expire);
delete_cookie(name);
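
The read side of such a wrapper could look like the sketch below. It is written in Python for illustration and makes two assumptions: the name get_cookie is hypothetical, and the function takes the raw "name=value; name2=value2" string as an argument instead of reading document.cookie.

```python
def get_cookie(cookie_string, name):
    """Return the value of cookie 'name' from a 'k=v; k2=v2' string, or None."""
    for part in cookie_string.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key == name:
            return value
    return None
```

With a helper like this, the loop above collapses to a single call such as int(get_cookie(cookie_string, "max_results")).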

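When an interface is less than ideal, you can always wrap it in your own functions to make it easier to use.

####Shape the interface to your needs

user_info = { "username": "...", "password": "..." }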
user_str = json.dumps(user_info)
cipher = Cipher("aes_128_cbc", key=PRIVATE_KEY, init_vector=INIT_VECTOR, op=ENCODE)
encrypted_bytes = cipher.update(user_str)
encrypted_bytes += cipher.final() # flush out the current 128 bit block
url = "http://example.com/?user_info=" + base64.urlsafe_b64encode(encrypted_bytes)
...

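The end goal is to put the user's information into a URL, but most of the code is spent turning a Python object into an encrypted, URL-safe string. So wrap that part up: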

def url_safe_encrypt(obj):
    obj_str = json.dumps(obj)
    cipher = Cipher("aes_128_cbc", key=PRIVATE_KEY, init_vector=INIT_VECTOR, op=ENCODE)
    encrypted_bytes = cipher.update(obj_str)
    encrypted_bytes += cipher.final() # flush out the current 128 bit block
    return base64.urlsafe_b64encode(encrypted_bytes)

It can then be called from elsewhere as well:

user_info = { "username": "...", "password": "..." }
url = "http://example.com/?user_info=" + url_safe_encrypt(user_info)

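Extracting functions is a good habit, but in moderation: splitting code into too many tiny functions makes it harder to find your way around.

###One task at a time

Code should do only one task at a time.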

var place = location_info["LocalityName"];  // e.g. "Santa Monica"
if (!place) {
    place = location_info["SubAdministrativeAreaName"];  // e.g. "Los Angeles"
}
if (!place) {
    place = location_info["AdministrativeAreaName"];  // e.g. "California"
}
if (!place) {
    place = "Middle-of-Nowhere";
}
if (location_info["CountryName"]) {
    place += ", " + location_info["CountryName"];  // e.g. "USA"
} else {
    place += ", Planet Earth";
}

return place;

This code builds a place name. It is full of conditionals and hard to read. Can the work be broken down into separate tasks?

var town    = location_info["LocalityName"];               // e.g. "Santa Monica"
var city    = location_info["SubAdministrativeAreaName"];  // e.g. "Los Angeles"
var state   = location_info["AdministrativeAreaName"];     // e.g. "CA"
var country = location_info["CountryName"];                // e.g. "USA"

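The first task is to extract each value into its own variable, so that the long keys no longer have to be remembered later. The second task is to work out the second half of the place string:

// Start with the default, and keep overwriting with the most specific value.
var second_half = "Planet Earth";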
if (country) {
    second_half = country; 
}
if (state && country === "USA") {
    second_half = state; 
}

Then the first half:

var first_half = "Middle-of-Nowhere";
if (state && country !== "USA") {
    first_half = state; 
}
if (city) {
    first_half = city;
}
if (town) {
    first_half = town; 
}

And finally:

return first_half + ", " + second_half;

Noticing that both halves test whether the country is "USA", the logic can also be written like this:

var first_half, second_half;
if (country === "USA") {
    first_half = town || city || "Middle-of-Nowhere";
    second_half = state || "USA";
} else {
    first_half = town || city || state || "Middle-of-Nowhere";
    second_half = country || "Planet Earth";
}
return first_half + ", " + second_half;
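
The same logic, ported to Python so it can be exercised directly. This is a sketch: the function name format_place is an assumption, and missing keys are treated like empty values.

```python
def format_place(location_info):
    """Build a 'first half, second half' place name from a location dict."""
    town = location_info.get("LocalityName")
    city = location_info.get("SubAdministrativeAreaName")
    state = location_info.get("AdministrativeAreaName")
    country = location_info.get("CountryName")

    if country == "USA":
        first_half = town or city or "Middle-of-Nowhere"
        second_half = state or "USA"
    else:
        first_half = town or city or state or "Middle-of-Nowhere"
        second_half = country or "Planet Earth"
    return first_half + ", " + second_half
```

For a US location with a town and a state this yields, for example, "Santa Monica, CA"; an empty dict yields "Middle-of-Nowhere, Planet Earth".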

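###Turning thoughts into code

When you explain something complicated to another person, the details can easily confuse them. So imagine explaining your code to someone in plain language and ask whether they would understand it. A few steps help make code clearer:

Describe what the code needs to do in plain language, as you would to a colleague
Pay attention to the key words and phrases in that description
Write the code to match the description

The following code checks whether the user is authorized: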

$is_admin = is_admin_request();
if ($document) {
    if (!$is_admin && ($document['username'] != $_SESSION['username'])) {
        return not_authorized();
    }
} else {
    if (!$is_admin) {
        return not_authorized();
    } 
}
// continue rendering the page ...

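The code is short, but its nested logic is complicated, and as discussed earlier, deep nesting hurts comprehension. Described in plain language, the logic is:

There are two ways you can be authorized:
1. You are an admin
2. You own the document
Otherwise, you are not authorized

Code written to match that description: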

if (is_admin_request()) {
    // authorized
} elseif ($document && ($document['username'] == $_SESSION['username'])) {
    // authorized
} else {
    return not_authorized();
}
// continue rendering the page ...
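
The same decision can also be expressed as a small function that follows the description line by line. A Python sketch, assuming a hypothetical is_authorized helper that receives the admin flag, the document (a dict or None), and the session's username:

```python
def is_authorized(is_admin, document, session_username):
    """Return True if the request may proceed.

    'document' is a dict with a 'username' key, or None if there is no document.
    """
    if is_admin:
        return True  # case 1: you are an admin
    if document and document["username"] == session_username:
        return True  # case 2: you own the document
    return False  # otherwise, not authorized
```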

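###Write less code

The most readable code is no code at all!

Eliminate features that are not worth having, and do not over-engineer
Rethink the requirements so that you solve the simplest version of the problem that still meets the overall goal
Know the libraries you use, and periodically read through their APIs

##Finally

The book also has chapters on testing, which I leave for you to read yourself. Once again, I recommend the book.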
