2024-04-08
14:48 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance [production]
14:48 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance [production]
14:48 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance [production]
14:48 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2113 (T360332)', diff saved to https://phabricator.wikimedia.org/P59855 and previous config saved to /var/cache/conftool/dbconfig/20240408-144808-arnaudb.json [production]
14:47 <jayme@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
14:47 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db1221 (T360332)', diff saved to https://phabricator.wikimedia.org/P59854 and previous config saved to /var/cache/conftool/dbconfig/20240408-144738-arnaudb.json [production]
14:47 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
14:47 <jayme@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
14:47 <jayme@deploy1002> helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
14:47 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
14:47 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance [production]
14:47 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance [production]
14:47 <jayme@deploy1002> helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
14:46 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1199 (T360332)', diff saved to https://phabricator.wikimedia.org/P59853 and previous config saved to /var/cache/conftool/dbconfig/20240408-144657-arnaudb.json [production]
14:44 <jayme@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'. [production]
14:43 <jayme@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'. [production]
14:43 <jayme@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'. [production]
14:42 <pfischer@deploy1002> helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
14:42 <pfischer@deploy1002> helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply [production]
14:42 <jayme@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'. [production]
14:41 <jayme@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'. [production]
14:40 <jayme@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'. [production]
14:39 <jayme@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
14:38 <jayme@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
14:37 <godog> bounce thanos-query and thanos-store on titan1002 - stuck on high CPU [production]
14:33 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2113', diff saved to https://phabricator.wikimedia.org/P59852 and previous config saved to /var/cache/conftool/dbconfig/20240408-143301-arnaudb.json [production]
14:31 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P59851 and previous config saved to /var/cache/conftool/dbconfig/20240408-143149-arnaudb.json [production]
14:24 <sukhe@cumin1002> START - Cookbook sre.hosts.reimage for host cp3069.esams.wmnet with OS bullseye [production]
14:20 <jayme@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
14:19 <sukhe> depool cp3069 to prepare for reimaging: T360430 [production]
14:19 <jayme@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
14:17 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2113', diff saved to https://phabricator.wikimedia.org/P59850 and previous config saved to /var/cache/conftool/dbconfig/20240408-141753-arnaudb.json [production]
14:16 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P59849 and previous config saved to /var/cache/conftool/dbconfig/20240408-141641-arnaudb.json [production]
14:06 <isaranto@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
14:04 <isaranto@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
14:02 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2113 (T360332)', diff saved to https://phabricator.wikimedia.org/P59847 and previous config saved to /var/cache/conftool/dbconfig/20240408-140246-arnaudb.json [production]
14:01 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1199 (T360332)', diff saved to https://phabricator.wikimedia.org/P59846 and previous config saved to /var/cache/conftool/dbconfig/20240408-140132-arnaudb.json [production]
14:00 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db2113 (T360332)', diff saved to https://phabricator.wikimedia.org/P59845 and previous config saved to /var/cache/conftool/dbconfig/20240408-135926-arnaudb.json [production]
13:59 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db1199 (T360332)', diff saved to https://phabricator.wikimedia.org/P59844 and previous config saved to /var/cache/conftool/dbconfig/20240408-135915-arnaudb.json [production]
13:59 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2113.codfw.wmnet with reason: Maintenance [production]
13:59 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance [production]
13:59 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db2113.codfw.wmnet with reason: Maintenance [production]
13:59 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance [production]
13:58 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1190 (T360332)', diff saved to https://phabricator.wikimedia.org/P59843 and previous config saved to /var/cache/conftool/dbconfig/20240408-135852-arnaudb.json [production]
13:57 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2111.codfw.wmnet with reason: Maintenance [production]
13:57 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db2111.codfw.wmnet with reason: Maintenance [production]
13:56 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance [production]
13:56 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance [production]
13:55 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance [production]
13:55 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance [production]