Exploring iota in Go

Why explore iota?

It started with a protobuf parsing error I ran into recently. The error message was:

cannot parse reserved wire type

For the cause of this error, see the following post:

https://stackoverflow.com/questions/62033409/cannot-find-solutions-to-a-protobuf-unmarshal-error

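While looking at how this error is defined in the source, I found that it belongs to a group of const iota definitions whose values are assigned via -iota. The code is in protobuf/encoding/protowire/wire.go: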

from: google.golang.org/protobuf/encoding/protowire/wire.go

const (
 _ = -iota
 errCodeTruncated
 errCodeFieldNumber
 errCodeOverflow
 errCodeReserved
 errCodeEndGroup
)

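To be honest, in all the time I have used Go, this was the first time I had seen -iota, so it immediately caught my interest. My first thought was to check the Go spec at https://golang.org/ref/spec#Iota, but the spec does not mention the -iota usage.

To understand iota properly, I started digging and wrote down the process here, hoping it can start a discussion.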

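Where iota comes from

Iota (uppercase Ι, lowercase ι; Chinese transliteration: 約塔) is a full name, not an abbreviation of some phrase. Iota is the ninth letter of the Greek alphabet.

Etymology of the Greek letter iota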

Iota From Ancient Greek ἰῶτα (iôta). (jot): In reference to a phrase in the New Testament: "until heaven and earth pass away, not an iota, not a dot, will pass from the Law" (Mt 5:18), iota being the smallest letter of the Greek alphabet.

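Wikipedia

The following claims also circulate online (I am not familiar with them and cannot verify them, so I leave them here for reference):

iota is a typical mathematical symbol, used: 1. as the iterator in a summation 2. as a subscript index 3. for the imaginary part of a complex number

The Greek letter iota is used in the programming language APL to generate a sequence of consecutive integers (see also the iota Scheme literature).

APL stands for A Programming Language or Array Processing Language. Kenneth Iverson was working at Harvard University when he designed the language in 1962; in 1979 he received the Turing Award for his contributions to mathematical notation and programming language theory. Over decades of use, APL has kept changing and evolving from its original form, and today's versions are very different from the version published in 1963. It has, however, always been an interpreted language.

Wikipedia

C++ and other languages have something similar to Go's iota.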

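What is Go's iota actually for?

It can be used like an enum. Inside a const block it starts at 0: the first line is 0, and each following line adds 1.

Go's use of iota is roughly based on its definition in APL.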

https://en.wikipedia.org/wiki/Iota
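As a minimal sketch of the enum-like usage described above: the Color type, its constants, and the String method below are hypothetical names chosen only for illustration.

```go
package main

import "fmt"

// Color is a hypothetical enum-like type used only for illustration.
type Color int

const (
	Red   Color = iota // 0
	Green              // 1
	Blue               // 2
)

// String lets the constants print by name instead of by number.
func (c Color) String() string {
	switch c {
	case Red:
		return "Red"
	case Green:
		return "Green"
	case Blue:
		return "Blue"
	}
	return fmt.Sprintf("Color(%d)", int(c))
}

func main() {
	fmt.Println(Red, Green, Blue)                // Red Green Blue
	fmt.Println(int(Red), int(Green), int(Blue)) // 0 1 2
}
```

Giving the constants a named type is a common choice because it allows methods such as String to be attached to them.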

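iota usage / caveats / exploration

Different const declaration blocks do not affect each other
All comment lines and blank lines are ignored
Starting from the first line, iota begins at 0 and increases by 1 per line (more precisely, per ConstSpec)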
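The three rules above can be checked with a small sketch. All constant names here are made up for the example.

```go
package main

import "fmt"

const (
	a = iota // 0
	b        // 1

	// Comment lines and blank lines do not advance iota.

	c // 2
)

// A separate const block starts counting from 0 again.
const (
	x = iota // 0
	y        // 1
)

// iota counts ConstSpecs, so a discarded or non-iota spec still advances it.
const (
	p = iota  // 0
	_         // 1, discarded
	q = "str" // iota is 2 here, but it is not used
	r = iota  // 3
)

func main() {
	fmt.Println(a, b, c) // 0 1 2
	fmt.Println(x, y)    // 0 1
	fmt.Println(p, q, r) // 0 str 3
}
```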

Next, let's look at an example of iota usage that was raised in a Go issue:

https://github.com/golang/go/issues/39751

package main

import "fmt"

type myConst int

const (
        zero myConst = iota // iota = 0
        one                 // iota = 1
        three = iota + 1    // iota = 2
        foure               // iota = 3
        five = iota + 1     // iota = 4
)

func testIota() {
        // prints 3 4 5; the issue asks why it is not 3 4 6
        fmt.Println(three, foure, five)
}

func main() {
        testIota()
}

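We know that iota starts at 0 on the first line and increases line by line. iota + 1 is just an ordinary expression evaluation and has no effect on iota itself. A line without an expression, such as foure, implicitly repeats the previous expression (iota + 1) with the current iota, which is 3, so it evaluates to 4; five is then 4 + 1 = 5.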
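To make the implicit repetition visible, here is a sketch that places the block from the issue next to a version with every expression written out by hand. The constants with the X suffix are hypothetical names added for the comparison.

```go
package main

import "fmt"

type myConst int

// Implicit form, as in the issue.
const (
	zero  myConst = iota // iota = 0
	one                  // iota = 1, repeats "myConst = iota"
	three = iota + 1     // iota = 2
	foure                // iota = 3, repeats "= iota + 1"
	five  = iota + 1     // iota = 4
)

// The same block with each iota replaced by its value at that line.
const (
	zeroX  myConst = 0
	oneX   myConst = 1
	threeX         = 2 + 1
	foureX         = 3 + 1
	fiveX          = 4 + 1
)

func main() {
	fmt.Println(three, foure, five)         // 3 4 5
	fmt.Println(threeX, foureX, fiveX)      // 3 4 5
	fmt.Println(zero == zeroX, one == oneX) // true true
}
```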

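We can also modify the compiler source, build a new go toolchain from it, and run our program with that. The change goes into src/cmd/compile/internal/noder/noder.go#L456 constDecl(): print the value just before cs.iota++. This produces the following output, which confirms what I said above.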

constState.iota: 0
constState.iota: 1
constState.iota: 2
constState.iota: 3
constState.iota: 4

Can iota be used in ordinary variable declarations?

On August 16, 2017, Robert Griesemer filed a proposal to allow this in Go 2.

https://github.com/golang/go/issues/21473
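Until such a proposal is accepted, iota in a var declaration is rejected. Since a failing program cannot be shown as a runnable example, the sketch below type-checks two tiny sources with the standard library's go/types package instead. Note that go/types is not the cmd/compile code path discussed in this article, and the helper check is a hypothetical name.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// check type-checks a single-file package and returns the first error, if any.
func check(src string) error {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return err
	}
	_, err = (&types.Config{}).Check("p", fset, []*ast.File{f}, nil)
	return err
}

func main() {
	// iota inside a const declaration type-checks fine.
	fmt.Println(check("package p\nconst c = iota") == nil) // true

	// iota inside a var declaration is reported as an error.
	fmt.Println(check("package p\nvar v = iota") != nil) // true
}
```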

What is -iota?

We can find this usage directly in the Go standard library, in src/text/scanner/scanner.go:

// The result of Scan is one of these tokens or a Unicode character.
const (
 EOF = -(iota + 1)
 Ident
 Int
 Float
 Char
 String
 RawString
 Comment

 // internal use only
 skipComment
)

Without running a test, would you have known that Ident has the value -2?
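A quick way to check is to print the exported constants of text/scanner. The helper tokenValues is a hypothetical name. The unexported skipComment cannot be referenced from outside the package, so it is left out.

```go
package main

import (
	"fmt"
	"text/scanner"
)

// tokenValues returns the integer values of the exported token constants,
// in declaration order.
func tokenValues() []int {
	return []int{
		scanner.EOF, scanner.Ident, scanner.Int, scanner.Float,
		scanner.Char, scanner.String, scanner.RawString, scanner.Comment,
	}
}

func main() {
	// Each constant is -(iota + 1), so the sequence starts at -1, not 0.
	fmt.Println(tokenValues()) // [-1 -2 -3 -4 -5 -6 -7 -8]

	// TokenString maps a token value back to a readable name.
	fmt.Println(scanner.TokenString(scanner.Ident)) // Ident
}
```

Negative values keep the token constants from colliding with ordinary Unicode characters, which Scan can also return.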

There is also src/cmd/asm/internal/arch/arch.go:

// Pseudo-registers whose names are the constant name without the leading R.
const (
 RFP = -(iota + 1)
 RSB
 RSP
 RPC
)

And src/cmd/compile/internal/gc/bexport.go:

// Tags. Must be < 0.
const (
 // Objects
 packageTag = -(iota + 1)
 constTag
 typeTag
 varTag
 funcTag
 endTag

 // Types
 namedTag
 arrayTag
 sliceTag
 dddTag
 structTag
 pointerTag
 signatureTag
 interfaceTag
 mapTag
 chanTag

 // Values
 falseTag
 trueTag
 int64Tag
 floatTag
 fractionTag // not used by gc
 complexTag
 stringTag
 nilTag
 unknownTag // not used by gc (only appears in packages with errors)

 // Type aliases
 aliasTag
)

-iota simply puts a minus sign in front of iota. It is an ordinary expression, so the result is just the negation of iota's current value.
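The two spellings seen so far, -iota and -(iota + 1), differ only in where the sequence starts. A small sketch: the first block copies the names from protowire, while tokA, tokB, and tokC are hypothetical.

```go
package main

import "fmt"

// Style used in protowire: -iota. The first value would be -0, which is 0,
// so it is thrown away with the blank identifier.
const (
	_ = -iota
	errCodeTruncated
	errCodeFieldNumber
	errCodeOverflow
	errCodeReserved
	errCodeEndGroup
)

// Style used in text/scanner: -(iota + 1). The first value is already -1,
// so no placeholder is needed.
const (
	tokA = -(iota + 1)
	tokB
	tokC
)

func main() {
	// -1 -2 -3 -4 -5
	fmt.Println(errCodeTruncated, errCodeFieldNumber, errCodeOverflow, errCodeReserved, errCodeEndGroup)
	// -1 -2 -3
	fmt.Println(tokA, tokB, tokC)
}
```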

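A source-level look at iota

Many thanks to 歐神 for his guidance on reading this part of the source. If you are interested in the Go source code, you can read his book 《Go 語言原本》 directly.

Read online: https://golang.design/under-the-hood/

Why do the const values decrease after -iota?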

https://golang.design/gossa?id=b49e9104-3750-4a47-ba55-e491489d8cf1

Under src/cmd/compile/internal/ir there is a ConstExpr:

type ConstExpr struct {
 miniExpr
 origNode
 val constant.Value
}

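Note: this ConstExpr exists only on master of the Go repository; the current go1.16.3 does not have this definition.

An aside on the ir refactoring

Why was a separate ir package factored out? We can trace the reason through the git commit history:

To break up package gc completely, the compiler IR it defines has to move into a separate package that can be imported by the packages that gc itself imports.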

https://go-review.googlesource.com/c/go/+/273008
https://github.com/golang/go/commit/84e2bd611f9b62ec3b581f8a0d932dc4252ceb67#diff-48abec9cb23c79fc5e7fa7f2b4a81e079a9d405d257bbd72758e983506244c99

Why refactor to ConstExpr? The CL description explains:

https://go-review.googlesource.com/c/go/+/275033
https://github.com/golang/go/commit/a2058bac21f40925a33d7f99622c967b65827f29#diff-4d97274cfa5251d986ad72fc0a75c0f3a907d65b92426cbce4e6328821afc104
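Previously, constant-folded expressions were represented with Name, which is suboptimal: Name carries many fields to support declared names (irrelevant to constant-folded expressions), whereas a constant expression is fairly simple.
The lightweight ConstExpr type simply wraps an existing expression and associates it with a value.

Back to the main topic: let's continue with how iota is computed.

The most obvious way to locate it is to search the whole code base for iota and look at where it is assigned, but in practice it is far less simple than that.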

// constState tracks state between constant specifiers within a
// declaration group. This state is kept separate from noder so nested
// constant declarations are handled correctly (e.g., issue 15550).
type constState struct {
 group  *syntax.Group
 typ    ir.Ntype
 values []ir.Node
 iota   int64
}

The entry point is in src/cmd/compile/main.go:

func main() {
 gc.Main(archInit)
}

gc.Main, in src/cmd/compile/internal/gc/main.go:

func Main(archInit func(*ssagen.ArchInfo)) {
 ...
 // Parse and typecheck input.
 noder.LoadPackage(flag.Args())
 ...
}

Pseudocode for LoadPackage:

func LoadPackage(filenames []string) {
 ...
 for _, p := range noders {
  p.node()
  p.file = nil // release memory
 }
 ...
 // Process top-level declarations in phases.
 // Phase 1: const, type, and names and types of funcs.
 //   This will gather all the information about types
 //   and methods but doesn't depend on any of it.
 //
 //   We also defer type alias declarations until phase 2
 //   to avoid cycles like #18640.
 //   TODO(gri) Remove this again once we have a fix for #25838.

 ...
 // Phase 2: Variable assignments.
 //   To check interface assignments, depends on phase 1.

 ...
 // Phase 3: Type check function bodies.

 ...
 // Phase 4: Check external declarations.
 // TODO(mdempsky): This should be handled when type checking their
 // corresponding ODCL nodes.
 ...
 // Phase 5: With all user code type-checked, it's now safe to verify map keys.
 // With all user code typechecked, it's now safe to verify unused dot imports.
}

Before the "Process top-level declarations in phases" step, p.node() mainly does one thing:

typecheck.Target.Decls = append(typecheck.Target.Decls, p.decls(p.file.DeclList)...)

For const, decls ends up running func (p *noder) constDecl(decl *syntax.ConstDecl, cs *constState) []ir.Node:

func (p *noder) constDecl(decl *syntax.ConstDecl, cs *constState) []ir.Node {
 ...

 n.SetIota(cs.iota)
 ...

 cs.iota++

 return nn
}

This is how iota ends up being an incrementing value.
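The bookkeeping can be imitated with a toy model. This is not the compiler's code: constState and constName below are heavily simplified, hypothetical stand-ins that keep only the iota counter and the per-name iota slot, under the assumption that each call handles exactly one ConstSpec.

```go
package main

import "fmt"

// constState is a simplified stand-in for the compiler's constState:
// it only keeps the per-group iota counter.
type constState struct {
	iota int64
}

// constName is a simplified stand-in for ir.Name, keeping only the iota slot.
type constName struct {
	sym  string
	iota int64
}

// constDecl mimics the shape of noder.constDecl: every name in one spec
// records the current counter, then the counter advances once per spec.
func constDecl(names []string, cs *constState) []constName {
	nn := make([]constName, 0, len(names))
	for _, s := range names {
		nn = append(nn, constName{sym: s, iota: cs.iota})
	}
	cs.iota++
	return nn
}

func main() {
	// One inner slice per ConstSpec; the third spec declares two names.
	specs := [][]string{{"zero"}, {"one"}, {"a", "b"}, {"three"}}

	cs := &constState{} // a fresh state per const group, so iota restarts at 0
	for _, spec := range specs {
		for _, n := range constDecl(spec, cs) {
			fmt.Printf("%s: iota=%d\n", n.sym, n.iota)
		}
	}
}
```

In this model, two names declared in the same spec share one iota value, and a new const group gets a fresh state, which matches the rule that separate const blocks do not interfere with each other.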

In go 1.16.3, typecheck runs on the fly during the phases above; on master it is no longer implemented this way.

func typecheck(n ir.Node, top int) (res ir.Node) {
 ...
 // Resolve definition of name and value of iota lazily.
 n = Resolve(n)
 ...
 n = typecheck1(n, top)
 ...
 if t != nil {
  n = EvalConst(n)
  t = n.Type()
 }
 ...
 return n
}

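Let's break down the logic above:

Resolve obtains the value of iota. Note that the value at this point is still non-negative; the minus sign has not been applied yet.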

// Resolve ONONAME to definition, if any.
func Resolve(n ir.Node) (res ir.Node) {
 ...
 if r.Op() == ir.OIOTA {
    if x := getIotaValue(); x >= 0 {
       return ir.NewInt(x)
    }
    return n
 }
 ...
}
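That the sign is separate from iota can also be seen at the syntax level. The sketch below uses the standard library's go/parser and go/ast (not the compiler's own syntax package, so this is only an analogy) to show that -iota parses as a unary minus applied to the identifier iota. The helper describe is a hypothetical name.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// describe parses an expression and, if it is a unary operator applied to
// an identifier, returns the operator and the identifier name.
func describe(expr string) (token.Token, string, bool) {
	e, err := parser.ParseExpr(expr)
	if err != nil {
		return token.ILLEGAL, "", false
	}
	u, ok := e.(*ast.UnaryExpr)
	if !ok {
		return token.ILLEGAL, "", false
	}
	id, ok := u.X.(*ast.Ident)
	if !ok {
		return token.ILLEGAL, "", false
	}
	return u.Op, id.Name, true
}

func main() {
	op, name, ok := describe("-iota")
	fmt.Println(op, name, ok) // - iota true
}
```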

getIotaValue() takes the last ir.Name from typecheckdefstack and reads its Offset_ value.

// type checks the whole tree of an expression.
// calculates expression types.
// evaluates compile time constants.
// marks variables that escape the local frame.
// rewrites n.Op to be more specific in some cases.

var typecheckdefstack []*ir.Name
...

// getIotaValue returns the current value for "iota",
// or -1 if not within a ConstSpec.
func getIotaValue() int64 {
 if i := len(typecheckdefstack); i > 0 {
  if x := typecheckdefstack[i-1]; x.Op() == ir.OLITERAL {
   return x.Iota()
  }
 }
 ...
 return -1
}

The operations on Offset_ in ir.Name:

func (n *Name) Iota() int64            { return n.Offset_ }
func (n *Name) SetIota(x int64)        { n.Offset_ = x }

typecheck1 does a whole lot of work, all of it paving the way for EvalConst.
The EvalConst logic:

https://github.com/golang/go/blob/2ebe77a2fda1ee9ff6fd9a3e08933ad1ebaea039/src/cmd/compile/internal/typecheck/const.go#L395

// EvalConst returns a constant-evaluated expression equivalent to n.
// If n is not a constant, EvalConst returns n.
// Otherwise, EvalConst returns a new OLITERAL with the same value as n,
// and with .Orig pointing back to n.
func EvalConst(n ir.Node) ir.Node {
 // Pick off just the opcodes that can be constant evaluated.
 switch n.Op() {
  // The key to building the constant is tokenForOp[n.Op()]: is the operator + or -?
  return OrigConst(...)
 }
 ...
}

constant.UnaryOp then computes the value according to token.SUB (-):

// UnaryOp returns the result of the unary expression op y.
// The operation must be defined for the operand.
// If prec > 0 it specifies the ^ (xor) result size in bits.
// If y is Unknown, the result is Unknown.
//
func UnaryOp(op token.Token, y Value, prec uint) Value {
 ...
 case token.SUB:
  switch y := y.(type) {
  case unknownVal:
   return y
  case int64Val:
   if z := -y; z != y {
    return z // no overflow
   }
   return makeInt(newInt().Neg(big.NewInt(int64(y))))
  case intVal:
   return makeInt(newInt().Neg(y.val))
  case ratVal:
   return makeRat(newRat().Neg(y.val))
  case floatVal:
   return makeFloat(newFloat().Neg(y.val))
  case complexVal:
   re := UnaryOp(token.SUB, y.re, 0)
   im := UnaryOp(token.SUB, y.im, 0)
   return makeComplex(re, im)
  }
 ...
}
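constant.UnaryOp is exported from go/constant, so it can be called directly. The sketch below feeds it the non-negative iota values 0 through 5 with token.SUB. The helper negate is a hypothetical name.

```go
package main

import (
	"fmt"
	"go/constant"
	"go/token"
)

// negate applies unary minus to an int64 using the go/constant package.
func negate(x int64) int64 {
	v := constant.UnaryOp(token.SUB, constant.MakeInt64(x), 0)
	n, _ := constant.Int64Val(v) // exact for the small values used here
	return n
}

func main() {
	// The iota values of a six-line const block, negated one by one.
	for i := int64(0); i <= 5; i++ {
		fmt.Println(i, "->", negate(i))
	}
}
```

The input grows by one per line while the output shrinks by one, which is the "decreasing" sequence seen in the protowire constants.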

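In summary, we now have a rough picture of how iota-based constants count up and down in Go: iota itself only ever increases, and the decreasing values come from negating it.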
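As a final cross-check, the whole pipeline can be exercised end to end with the standard library's type checker. This sketch uses go/types rather than cmd/compile, so it is only an approximation of the path traced above. The helper constValue is a hypothetical name, and src copies the protowire block.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

const src = `package p

const (
	_ = -iota
	errCodeTruncated
	errCodeFieldNumber
	errCodeOverflow
	errCodeReserved
	errCodeEndGroup
)
`

// constValue type-checks src and returns the value of the named constant
// as a string.
func constValue(name string) (string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return "", err
	}
	pkg, err := (&types.Config{}).Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		return "", err
	}
	c, ok := pkg.Scope().Lookup(name).(*types.Const)
	if !ok {
		return "", fmt.Errorf("%s is not a constant", name)
	}
	return c.Val().String(), nil
}

func main() {
	for _, name := range []string{"errCodeTruncated", "errCodeReserved", "errCodeEndGroup"} {
		v, err := constValue(name)
		fmt.Println(name, v, err)
	}
}
```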

Notes and issues on enum support in Go

https://github.com/golang/go/issues/28987#issuecomment-497108307

Issues related to iota or enum
https://github.com/golang/go/issues/28987
https://github.com/golang/go/issues/28438
https://github.com/golang/go/issues/21473
https://github.com/golang/go/issues/19814
https://github.com/golang/go/issues/39751

References

https://zh.wikipedia.org/wiki/Ιota
https://github.com/golang/go/wiki/Iota
https://golang.org/ref/spec#Iota
https://stackoverflow.com/questions/31650192/whats-the-full-name-for-iota-in-golang
Translation: https://cn.cosmicbeach2k.com/638772-whats-the-full-name-for-KUVIJR
https://stackoverflow.com/questions/28411850/why-is-it-called-iota
https://blog.learngoprogramming.com/golang-const-type-enums-iota-bc4befd096d3
https://blog.wolfogre.com/posts/golang-iota/

CLs mentioned in this article:

https://go-review.googlesource.com/c/go/+/273008
https://go-review.googlesource.com/c/go/+/275033

Source: https://mp.weixin.qq.com/s/V5iJgcLhCYMZd5l8akfbVQ