Performance Analysis and Optimization of Conversions Between []byte and string in Go

For a better-formatted version of this article, visit my personal site: https://pengrl.com/p/31544/

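Go programs frequently convert between the []byte and string types. This article discusses the cost of those conversions, the optimizations the Go compiler applies to them in certain specific cases, and what we can do ourselves to optimize them in high-performance scenarios.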

[]byte is simply a slice of byte. Its underlying struct is defined as follows (in runtime/slice.go):

type slice struct {
	array unsafe.Pointer
	len   int
	cap   int
}

The underlying struct for string is defined as follows (in runtime/string.go):

type stringStruct struct {
	str unsafe.Pointer
	len int
}

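Both structs contain a pointer (array or str) to the actual data, plus a len field that records the length of the data. A slice has an additional cap field that records its capacity. When data is appended to a slice whose spare capacity is insufficient, a larger block of memory is allocated, the contents of the current block are copied into it, and the append is then performed in the new block.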
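The layout and growth behavior described above can be observed from user code. The following is a minimal sketch; the helper names (headerWords, appendReallocates, appendInPlace) are made up for illustration and are not part of the runtime.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerWords returns the size of a slice header and a string header,
// measured in machine words (pointer-sized units).
func headerWords() (sliceWords, stringWords uintptr) {
	word := unsafe.Sizeof(uintptr(0))
	var b []byte
	var s string
	return unsafe.Sizeof(b) / word, unsafe.Sizeof(s) / word
}

// appendReallocates reports whether appending to a full slice moved
// the data to a new backing array.
func appendReallocates() bool {
	b := make([]byte, 4, 4) // len == cap: no spare capacity
	before := &b[0]
	b = append(b, 'x') // must allocate a larger block and copy
	return before != &b[0]
}

// appendInPlace reports whether appending within the spare capacity
// kept the same backing array.
func appendInPlace() bool {
	b := make([]byte, 4, 8) // 4 bytes of spare capacity
	before := &b[0]
	b = append(b, 'x')
	return before == &b[0]
}

func main() {
	sliceWords, stringWords := headerWords()
	fmt.Println(sliceWords, stringWords) // 3 2
	fmt.Println(appendReallocates())     // true
	fmt.Println(appendInPlace())         // true
}
```

The slice header is three words (pointer, len, cap) and the string header is two (pointer, len), which matches the two struct definitions.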

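Assigning one slice variable to another copies only the header (the pointer, len, and cap), so both variables end up pointing to the same data; the underlying bytes are not copied. The same is true of assignment between string variables.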
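A short sketch of this sharing behavior follows. The function names are illustrative only, and the string check relies on unsafe.StringData, which assumes Go 1.20 or later.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceAssignShares writes through the second variable and returns
// what the first variable sees afterwards.
func sliceAssignShares() string {
	a := []byte("hello")
	b := a     // copies pointer, len, and cap only
	b[0] = 'H' // visible through a as well
	return string(a)
}

// stringAssignShares reports whether two string variables point to
// the same underlying bytes after assignment.
func stringAssignShares() bool {
	s1 := "hello, world"
	s2 := s1
	return unsafe.StringData(s1) == unsafe.StringData(s2)
}

func main() {
	fmt.Println(sliceAssignShares())  // Hello
	fmt.Println(stringAssignShares()) // true
}
```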

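Converting between []byte and string, however, requires allocating new memory and copying the data. In Go's semantics the contents of a slice are mutable, whereas a string is immutable. If the two shared the same underlying data, a modification made through the slice would break the string's immutability.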
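The copy is easy to confirm: after a conversion, changing the slice leaves the string untouched. A minimal sketch (convertCopies is a hypothetical helper name):

```go
package main

import "fmt"

// convertCopies converts a slice to a string, then mutates the slice,
// and returns both values for comparison.
func convertCopies() (s string, after string) {
	b := []byte("hello")
	s = string(b) // the bytes are copied into memory owned by s
	b[0] = 'H'    // mutating b does not affect s
	return s, string(b)
}

func main() {
	s, after := convertCopies()
	fmt.Println(s, after) // hello Hello
}
```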

The two conversions are implemented by two functions in runtime/string.go: stringtoslicebyte for string to []byte, and slicebytetostring for []byte to string.
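One way to see the allocation cost of these two runtime calls from user code is testing.AllocsPerRun, which can be called from an ordinary program. This is a rough sketch under a few assumptions: the 1024-byte payload and the package-level sink variables are choices made here so that the converted values escape to the heap, and the exact counts may differ between Go versions.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// Package-level sinks keep the converted values alive, so the compiler
// cannot keep the copies on the stack or drop the conversions.
var (
	sinkString string
	sinkBytes  []byte
)

// conversionAllocs returns the average number of heap allocations per
// []byte -> string and string -> []byte conversion.
func conversionAllocs() (toString, toBytes float64) {
	b := make([]byte, 1024)
	s := strings.Repeat("x", 1024)
	toString = testing.AllocsPerRun(100, func() { sinkString = string(b) })
	toBytes = testing.AllocsPerRun(100, func() { sinkBytes = []byte(s) })
	return toString, toBytes
}

func main() {
	toString, toBytes := conversionAllocs()
	fmt.Println("allocs per string(b):", toString)
	fmt.Println("allocs per []byte(s):", toBytes)
}
```

Each conversion is expected to report at least one allocation, on top of which the runtime also copies all 1024 bytes.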

What if we want to avoid the cost of allocating and copying? Look at the two functions slicebytetostringtmp and stringtoslicebytetmp in runtime/string.go:

func slicebytetostringtmp(b []byte) string {
	// Return a "string" referring to the actual []byte bytes.
	// This is only for use by internal compiler optimizations
	// that know that the string form will be discarded before
	// the calling goroutine could possibly modify the original
	// slice or synchronize with another goroutine.
	// First such case is a m[string(k)] lookup where
	// m is a string-keyed map and k is a []byte.
	// Second such case is "<"+string(b)+">" concatenation where b is []byte.
	// Third such case is string(b)=="foo" comparison where b is []byte.

	if raceenabled && len(b) > 0 {
		racereadrangepc(unsafe.Pointer(&b[0]),
			uintptr(len(b)),
			getcallerpc(unsafe.Pointer(&b)),
			funcPC(slicebytetostringtmp))
	}
	return *(*string)(unsafe.Pointer(&b))
}

func stringtoslicebytetmp(s string) []byte {
	// Return a slice referring to the actual string bytes.
	// This is only for use by internal compiler optimizations
	// that know that the slice won't be mutated.
	// The only such case today is:
	// for i, c := range []byte(str)

	str := (*stringStruct)(unsafe.Pointer(&s))
	ret := slice{array: unsafe.Pointer(str.str), len: str.len, cap: str.len}
	return *(*[]byte)(unsafe.Pointer(&ret))
}
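The cases listed in the comments of these two functions can be checked from user code by counting allocations. The sketch below covers the map lookup and the comparison; the helper name optimizedAllocs, the 1024-byte key, and the sink variables are assumptions made for this example, and the results depend on the compiler version in use.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// Sinks keep the results alive so the work is not optimized away.
var (
	sinkInt  int
	sinkBool bool
)

// optimizedAllocs returns the average number of heap allocations for
// two patterns the compiler is documented to special-case:
// m[string(k)] lookups and string(b) == s comparisons.
func optimizedAllocs() (mapLookup, compare float64) {
	key := strings.Repeat("k", 1024)
	m := map[string]int{key: 1}
	k := []byte(key)

	mapLookup = testing.AllocsPerRun(100, func() {
		sinkInt = m[string(k)] // temporary string, no copy expected
	})
	compare = testing.AllocsPerRun(100, func() {
		sinkBool = string(k) == key // temporary string, no copy expected
	})
	return mapLookup, compare
}

func main() {
	mapLookup, compare := optimizedAllocs()
	fmt.Println("allocs per m[string(k)]:", mapLookup)
	fmt.Println("allocs per string(k) == key:", compare)
}
```

In both patterns the temporary string is discarded before the slice could be modified, which is the condition the runtime comment requires for skipping the copy.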